Abstract
OBJECTIVE: The growing complexity of oncological treatment is reflected in the requirements of current clinical trials, making it harder for clinical sites to recruit suitable participants. This cross-sectional study evaluates the potential of artificial intelligence (AI), using ChatGPT-4.0 as an example, to identify suitable study participants among patients with breast cancer from real-world tumor board data.
METHODS: ChatGPT-4.0 was trained on six fictitious study protocols for patients with breast cancer, mimicking real-world clinical trial scenarios. Anonymized data from 124 tumor board registrations from January 2024 were submitted to the AI to determine eligibility for study participation. A clinician control group also assessed the patients' eligibility. The evaluations of ChatGPT-4.0 and the medical professionals were benchmarked against an expert-validated reference standard. Sensitivity and specificity were calculated for the AI and for each member of the control group.
RESULTS: Of the 124 tumor board registrations, 19 patients met the eligibility criteria for at least one study. Both the AI and the clinicians reliably excluded ineligible patients (high specificity), but sensitivity varied. ChatGPT-4.0 proved especially ineffective at screening for neoadjuvant trials, whereas medical professionals showed better but heterogeneous performance. Team-based assessment identified nearly all eligible patients, underscoring the value of collaborative decision-making.
CONCLUSION: Although model performance was limited by simplified input data and a small single-center cohort, the results suggest that ChatGPT-4.0, in its current form, is not yet suitable as a stand-alone tool for patient identification in clinical breast cancer trials. To ensure accurate and efficient recruitment, the involvement of a multiprofessional team remains essential. Ongoing model refinement and access to larger, more detailed datasets may enhance the future utility of AI systems in clinical trial screening.
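For readers less familiar with the two metrics named in the methods, the following is a minimal Python sketch of how sensitivity and specificity could be computed for one rater (the AI or a single clinician) against a reference standard. The function name `screening_metrics`, the boolean encoding of eligibility, and the example labels are assumptions made for illustration. The labels are invented and are not study data, and the abstract does not state whether the unit of analysis was the patient or the patient-protocol pair.

```python
from typing import NamedTuple, Optional, Sequence


class ScreeningMetrics(NamedTuple):
    sensitivity: Optional[float]  # share of truly eligible cases the rater flagged
    specificity: Optional[float]  # share of truly ineligible cases the rater excluded


def screening_metrics(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> ScreeningMetrics:
    """Compare one rater's eligibility calls with the reference standard.

    True means "eligible". A metric is None when its denominator is zero.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # eligible, flagged
    fn = sum(1 for r, p in pairs if r and not p)      # eligible, missed
    tn = sum(1 for r, p in pairs if not r and not p)  # ineligible, excluded
    fp = sum(1 for r, p in pairs if not r and p)      # ineligible, flagged

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return ScreeningMetrics(sensitivity, specificity)


# Invented labels for illustration only (not taken from the study).
reference = [True, True, True, True, False, False, False, False, False, False]
rater = [True, True, True, False, False, False, False, False, False, True]

metrics = screening_metrics(reference, rater)
print(f"sensitivity={metrics.sensitivity:.2f}, specificity={metrics.specificity:.2f}")
```

In this toy example the rater misses one of four eligible cases and wrongly flags one of six ineligible cases. With few eligible cases, as in a cohort where most registrations are ineligible, a single missed patient shifts sensitivity considerably, which is one reason sensitivity can vary widely between raters while specificity stays high.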