Abstract
22q11.2 deletion syndrome (22q11DS) is associated with a variety of complications, including mental illness, intellectual disability, and physical disorders. Because these conditions overlap, existing healthcare frameworks often fit patients poorly, leaving support needs unmet. However, little is known about how patients and their caregivers perceive these issues. This study aims to automate thematic analysis (TA) and topic classification using natural language processing (NLP) techniques in order to extract medical needs from qualitative data provided by patients' caregivers. A web-based survey was conducted among caregivers of individuals with 22q11DS in Japan. A total of 125 caregivers participated, and their responses were analyzed to identify medical challenges and unmet needs. TA and NLP-based thematic extraction were applied to free-text responses related to medical concerns. To ensure privacy, the analysis was conducted offline using the open-source large language model Cohere Command R Plus. The study followed the Declaration of Helsinki and was approved by the Ethics Committee of the University of Tokyo Graduate School of Medicine and Faculty of Medicine. NLP identified three areas of medical challenge and unmet need among individuals with 22q11DS: a child-centered approach, comprehensive support across medical and welfare services, and support for caregivers through social and community networks. A comparison with manual TA confirmed that most themes were consistent between the two approaches. Given the limited size of our dataset, the implications of this study should be regarded as preliminary. Nevertheless, our findings suggest that NLP may serve as a useful exploratory approach to complement manual TA when analyzing larger free-text datasets on medical needs in future research. NLP should not be viewed as a replacement for manual TA, but rather as a supportive method that offers additional perspectives and reveals potential patterns.