Abstract
AIM: To evaluate the quality of ChatGPT-generated case reports and to assess the ability of ChatGPT to peer review medical articles. METHODS: This study was conducted from February to April 2023. First, ChatGPT 3.0 was used to generate 15 case reports, which were then peer-reviewed by expert human reviewers. Second, ChatGPT 4.0 was used to peer review 15 published short articles. RESULTS: ChatGPT was capable of generating case reports, but these reports contained inaccuracies, particularly in referencing. The case reports received mixed ratings from the peer reviewers, with 33.3% of the professional reviewers recommending rejection; the mean overall merit score was 4.9 ± 1.8 out of 10. ChatGPT's review capabilities were weaker than its text generation abilities: as a peer reviewer, it failed to recognize major inconsistencies in articles whose content had been substantially altered. CONCLUSION: While ChatGPT demonstrated proficiency in generating case reports, its output was limited in consistency and accuracy, especially in referencing.