Abstract
PURPOSE: Medical education professionals expect artificial intelligence (AI) systems to serve as an efficient faculty resource for content creation. However, prior findings suggest that machine learning algorithms may exacerbate negative stereotypes and undermine efforts toward diversity, equity, and inclusivity. This investigation explores the potential of OpenAI's ChatGPT and Microsoft's Bing A.I. Image Creator (MBIC) to perpetuate ethnoracial stereotypes in medical cases.

MATERIALS AND METHODS: A series of medically relevant vignettes and visual representations were requested from ChatGPT and MBIC for five medical conditions traditionally associated with certain ethnoracial groups: sickle cell anemia, cystic fibrosis, Tay-Sachs disease, beta-thalassemia, and aldehyde dehydrogenase deficiency. Initial prompting, self-prompting, and prompt engineering were performed iteratively to ascertain the extent to which the AI-generated vignettes and imagery were mutable or fixed.

RESULTS: The ethnoracial identities in the vignettes adhered more closely to the traditionally expected associations for each clinical condition than epidemiologic studies would suggest. Following prompt engineering and self-prompting, an increase in diversity was seen. On initial prompting, the most common ethnoracial identity depicted was Caucasian; secondary prompting resulted in less diversity, with greater conformity to the traditionally expected ethnoracial identity.

CONCLUSION: The prevalence of dataset bias and AI's user-dependent learning abilities underscore the importance of human stewardship. The increasing use of AI in generating medical education content, such as multiple-choice questions (MCQs), demands vigilant use of such tools to combat the reinforcement of race-based stereotypes in medicine.