Abstract
Adaptivity is a key component of social human-robot interaction (HRI) for achieving more natural and human-like interactions. Current interactive systems tend to rely on preset, repetitive verbal communication and isolated nonverbal interactions, which makes them less engaging. This study proposes an integrated framework that combines a coordinated nonverbal interaction system based on real-time emotion expression with a verbal communication system built on a fine-tuned large language model, yielding more engaging and context-aware interaction. The design uses the MiRo-E robot as the zoomorphic social interaction platform and aims to enhance consistency across verbal and nonverbal modalities and to improve user engagement through adaptive, emotionally aligned responses. To evaluate the approach, a user study was conducted with tasks designed to assess user engagement, task performance, and the perceived naturalness of interaction. Task performance metrics and subjective questionnaire responses indicate that the framework significantly enhances user experience, improving task completion rates, engagement, and perceived naturalness.