Abstract
Cardiovascular disease is a leading cause of mortality and rising healthcare costs worldwide. Fortunately, the disease is preventable, and addressing risk factors can significantly reduce its effects. Over the past decade, risk prediction models have advanced significantly, notably through polygenic risk score analysis, which is often combined with clinical health information for prediction. However, most previous cardiovascular disease prediction studies based on polygenic risk scores have focused on a single specific disease or event, such as cardiac events. Because cardiovascular disease is complex and involves a combination of genetic and environmental factors, a comprehensive analysis of disease prediction results is essential. In this study, we investigated the genetic and environmental factors contributing to cardiovascular disease using data from the Framingham Heart Study, a leading cardiovascular cohort. We compared the prediction performance of different methods across various scenarios and assessed it with multiple evaluation metrics to identify the best-fitting model for each of six cardiovascular-related diseases. We also analyzed the feature importance of genetic and clinical variables and found that the variables had different effects on each disease. Our findings characterize how well prediction algorithms forecast cardiovascular disease from genetic and clinical factors and highlight the importance of each feature in disease prediction. While models relying solely on polygenic risk scores showed relatively low prediction performance for some diseases, integrating genetic information with clinical data improved prediction performance in most cases. For certain diseases, particularly those known to be heritable, polygenic risk scores demonstrated predictive ability on their own, suggesting that they may serve as standalone predictive tools.
We believe our study reveals the value of combining polygenic risk scores with clinical variables and expect that our thorough analysis can inform study designs tailored to specific diseases and research objectives.