Abstract
BACKGROUND: Systemic anti-cancer and radiation therapies for lung cancer often cause adverse events that impair quality of life. Treatment may therefore be counterproductive if a patient's survival is too short to derive any benefit from it. Decision support tools that predict short-term mortality in patients with lung cancer have been investigated using data from the Danish National Patient Registry (DNPR). However, the extent to which the DNPR captures clinical manifestations of anti-cancer treatments, particularly in lung cancer, remains unknown. We assessed the validity of clinical manifestation reporting in the DNPR to evaluate its usability in decision support tools and epidemiological studies.
PATIENTS AND METHODS: Ninety-five patients who were treated for lung cancer at the Department of Oncology, Aalborg University Hospital, and who died between 2015 and 2022 were randomly selected. Four raters independently extracted clinical manifestations from the patients' journals (clinical records). Interrater agreement was measured using Light's kappa (threshold ≥ 0.41). Extracted manifestations were matched to DNPR data, and concordance was quantified using the F1-score at six levels of aggregation: date, week, month, quarter, year, and patient.
RESULTS: Interrater agreement ranged from moderate to perfect for all clinical manifestations except constipation. Concordance between journal and DNPR data was low for most clinical manifestations. F1-scores for dyspnea, hemoptysis, and pain suggested moderate reporting in the DNPR, almost exclusively at the patient level.
CONCLUSION: Concordance between the DNPR and patient journals was limited for minor and moderate clinical manifestations, whereas severe manifestations, such as dyspnea, hemoptysis, and pain, showed moderate agreement, indicating some usability in this limited context. This study also demonstrates that journal notes contain rich data on clinical manifestations.
Despite possible biases in these data, the study highlights the value of extracting clinical manifestations directly from patient journals, notably with natural language processing, to obtain more detailed data for epidemiological studies or to build machine learning-based decision support tools.
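To make the concordance measure concrete, the following is a minimal Python sketch of how an F1-score between journal-extracted and DNPR-recorded manifestations could be computed at the aggregation levels named in the methods. It is an illustration under stated assumptions, not the study's implementation: the record layout `(patient_id, manifestation, date)`, the function names `period_key` and `f1_concordance`, the use of ISO weeks, the choice of the journal as the reference, and the toy data are all hypothetical.

```python
from datetime import date


def period_key(d, level):
    """Map a date to a bucket for one aggregation level (assumed definitions)."""
    if level == "date":
        return (d.year, d.month, d.day)
    if level == "week":
        iso = d.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if level == "month":
        return (d.year, d.month)
    if level == "quarter":
        return (d.year, (d.month - 1) // 3 + 1)
    if level == "year":
        return (d.year,)
    if level == "patient":
        return ()  # ignore time entirely: one bucket per patient
    raise ValueError(f"unknown aggregation level: {level}")


def f1_concordance(journal, registry, level):
    """F1-score of registry records against journal records.

    Assumption: the journal is treated as the reference, and a registry
    record counts as a match when the same manifestation is recorded for
    the same patient within the same time bucket.
    """
    def bucket(records):
        return {(pid, m, period_key(d, level)) for pid, m, d in records}

    j, r = bucket(journal), bucket(registry)
    tp = len(j & r)   # in both sources
    fp = len(r - j)   # in the registry only
    fn = len(j - r)   # in the journal only
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, invented for illustration only.
journal = [
    ("p1", "dyspnea", date(2020, 3, 2)),
    ("p1", "pain", date(2020, 3, 10)),
    ("p2", "hemoptysis", date(2021, 6, 1)),
]
registry = [
    ("p1", "dyspnea", date(2020, 3, 20)),
    ("p2", "hemoptysis", date(2021, 6, 1)),
]

for level in ("date", "week", "month", "quarter", "year", "patient"):
    print(level, round(f1_concordance(journal, registry, level), 2))
```

In this toy example the dyspnea records fall in the same month but on different dates, so they match only from the month level upward. This shows how coarser aggregation can raise the F1-score without any change in the underlying records.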