Performance of ChatGPT on optometry and vision science exam questions


Abstract

The rapid proliferation of large language model (LLM) tools, such as ChatGPT developed by OpenAI, presents both a challenge and an opportunity for educators. While LLMs can generate convincing written responses across a wide range of academic fields, their capabilities vary noticeably across models, fields and even sub-fields. This paper evaluates the capabilities of LLMs in optometry and vision science by analysing the quality of responses generated by ChatGPT to sample long-answer questions covering four sub-fields of optometry: binocular vision, clinical communication, dispensing and ocular pathology. It also explores whether LLMs could be used as virtual graders. Several GPT models were examined (GPT-3.5, GPT-4 and o1, from oldest to newest). First, the concordance between ChatGPT and a human grader was investigated. The performance of the GPT models was then benchmarked on various sample questions in optometry and vision science. Statistical analyses included mixed-effects analysis, the Friedman test and the Wilcoxon signed-rank test, complemented by thematic analysis. ChatGPT graders awarded higher marks than human graders, but the difference was significant only for GPT-3.5 (p < 0.05). Benchmarking on sample questions demonstrated that all GPT models can generate satisfactory responses above the 50% 'pass' score in many cases (p < 0.05), although performance varied significantly across sub-fields (p < 0.0001) and models (p = 0.0003). Newer models significantly outperformed older models in most cases. Comparisons of thematic response error frequency between GPT-3.5 and GPT-4 were more mixed (p < 0.05 to p > 0.99), while o1 made no thematic errors. These findings indicate that ChatGPT may impact learning and teaching practices in this field. The inconsistent performance across sub-fields, together with additional implementation considerations such as ethics and transparency, supports a judicious adaptation of assessment practice and adoption of the technology in optometry and vision science education.
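
As an illustration of how a paired comparison between an LLM grader and a human grader might be tested, the following is a minimal sketch of an exact two-sided Wilcoxon signed-rank test in plain Python. The function name `wilcoxon_signed_rank` and the marks in the usage example are hypothetical and are not taken from the study; the abstract does not state which software or settings the authors used.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Returns (w, p), where w = min(W+, W-) and p is the exact p-value
    obtained by enumerating all sign assignments (practical for n <= ~15).
    """
    # Paired differences; zero differences are dropped by convention.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so count the assignments at least as extreme as w.
    rank_total = sum(ranks)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s_plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(s_plus, rank_total - s_plus) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Hypothetical marks for six answers (illustrative only, not study data).
human_marks = [10, 12, 9, 14, 11, 13]
llm_marks = [12, 15, 10, 18, 16, 19]
w, p = wilcoxon_signed_rank(llm_marks, human_marks)
print(f"W = {w}, p = {p:.5f}")
```

In this made-up example the LLM grader awards a higher mark on every answer, which is the most extreme outcome possible for six pairs. Exact enumeration is used here only to keep the sketch dependency-free; for larger samples a statistics library with a normal approximation would be the more usual choice.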
