Discovering sequential patterns and interrelations among multiple diseases in electronic medical records using cSPADE algorithm

利用cSPADE算法发现电子病历中多种疾病的序列模式和相互关系

阅读:2

Abstract

BACKGROUND: The intricate relationships between diseases are characterized by the sequence and temporal intervals of their onset, which are critical for understanding the essence of comorbidity and predicting disease progression. This study seeks to investigate the interdependencies and chronological order of various diseases that occur in the same patient by employing sequential pattern mining algorithms. Specifically, the research endeavors to delineate the disparities in the time intervals between the onset of distinct disorders and to scrutinize the concordance and discordance in disease sequence patterns across gender groups. METHODS: Patient identity information, visit dates, and diagnostic data were aggregated from the electronic medical record databases of three large general hospitals. The diagnostic information included the International Classification of Diseases, Tenth Revision (ICD-10) codes, along with their corresponding descriptions. A total of 1,060,344 diagnostic entries from 269,973 patients who visited during 2012-2022 were incorporated into the mining model, which was constructed using the Sequential Pattern Discovery using Equivalence Classes (SPADE) algorithm. RESULTS: A total of 212 highly supported sequential pattern rules were ultimately identified, most of which were related to disorders of the endocrine and circulatory systems. In 66 patterns, the order of disease incidence or diagnosis was relatively well-defined. The time interval between the onset of two diseases ranged from 1 to 2 years in most patterns. For patterns with short-term relationships, the interval was less than 2 months, whereas in some cases, the interval extended to 5 to 10 years. Among the extracted patterns, 176 exhibited stronger support in the male dataset compared to the female dataset. Patterns related to cardiovascular and liver diseases were more prevalent in males, while those associated with orthopedic and endocrine disorders showed higher prevalence in females. CONCLUSION: Our findings demonstrate the effectiveness of the constrained SPADE (cSPADE) algorithm in comorbidity research and highlight several clinically significant sequential comorbidity patterns. These patterns are expected to contribute to disease prevention, etiological research, and the development of clinical decision support systems.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。