Abstract
Multiple-choice questions are a core format for assessing applied language proficiency in standardized English tests such as BEC and TOEIC. Developing explanatory content for these materials has traditionally depended on manual test item analysis, which is labor-intensive and time-consuming. Consequently, Artificial Intelligence (AI) approaches centered on Multiple-Choice Reading Comprehension (MCRC) are becoming the preferred solution for generating auxiliary educational content. This task requires models that understand textual semantics in depth and accurately identify complex relationships among passages, questions, and answer options. Although Pre-trained Language Models (PLMs) have achieved remarkable success on MCRC tasks, existing methods face two primary limitations: (1) they remain prone to misclassifying distractor options that are textually very similar yet semantically distant (e.g., seemingly synonymous business terms); (2) their accuracy drops significantly on questions that require indirect reasoning or background knowledge to identify implicit answers. To address these challenges, this paper proposes the Contrastive Learning-driven Hierarchical Attention Model for Multiple Choice (CL-HAMC). The proposed model employs multi-head attention mechanisms to hierarchically model the interactions among the passage, question, and option triplet, simulating the progressive, multi-layered reasoning that humans apply during problem-solving. Furthermore, it incorporates a contrastive learning strategy to sharpen the model's ability to discern fine-grained semantic distinctions among answer choices. Extensive experiments on the RACE, RACE-M, and RACE-H benchmarks demonstrate that CL-HAMC achieves substantial and consistent performance gains, setting a new state of the art (SOTA) on all three datasets. Moreover, CL-HAMC achieves competitive results on the DREAM dataset.
This study provides an effective solution for the automated processing of distractor-rich multiple-choice questions in the domain of auxiliary English learning.