Abstract
Unlike traditional control systems that rely on physical input devices, facial gesture-based interaction offers a contactless and intuitive way to operate autonomous systems. Recent advances in computer vision and deep learning have made it possible to recognize commands from facial expressions and movements in human-robot interaction. In this work, we propose GhostNet-BiLSTM-Attention (GBA), a lightweight, real-time facial gesture recognition method that combines GhostNet and BiLSTM with an attention mechanism. The model is trained on the FaceGest dataset and coupled with a 3D robot simulation in Unity. The system is designed to recognize predefined facial gestures such as head tilts, eye blinks, and mouth movements with high accuracy and low inference latency. Recognized gestures are mapped to specific robot commands and transmitted to the Unity-based simulation environment via socket communication across machines. This framework enables smooth and immersive robot control without the need for conventional controllers or sensors. Real-time evaluation demonstrates the system's robustness and responsiveness under varied user and lighting conditions, and the model achieves a classification accuracy of 99.13% on the FaceGest dataset. GBA holds strong potential for applications in assistive robotics, contactless teleoperation, and immersive human-robot interfaces.
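
The gesture-to-command step described above could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the gesture labels, command names, confidence threshold, and newline-delimited JSON message format are all assumptions introduced here, since the abstract does not specify them.

```python
import json
import socket
from typing import Optional

# Hypothetical gesture-to-command table; labels and commands are illustrative only.
GESTURE_TO_COMMAND = {
    "head_tilt_left": "turn_left",
    "head_tilt_right": "turn_right",
    "eye_blink": "stop",
    "mouth_open": "move_forward",
}


def encode_command(gesture: str, confidence: float,
                   threshold: float = 0.8) -> Optional[bytes]:
    """Build a newline-terminated JSON message for a recognized gesture.

    Returns None when the gesture is unknown or the classifier confidence
    is below the (assumed) threshold, so that no command is sent.
    """
    command = GESTURE_TO_COMMAND.get(gesture)
    if command is None or confidence < threshold:
        return None
    message = {"command": command, "confidence": round(confidence, 3)}
    return (json.dumps(message) + "\n").encode("utf-8")


def send_command(sock: socket.socket, gesture: str, confidence: float) -> bool:
    """Send the mapped command over an already connected socket.

    Returns True if a message was sent, False if the gesture was ignored.
    """
    payload = encode_command(gesture, confidence)
    if payload is None:
        return False
    sock.sendall(payload)
    return True
```

In such a setup, the recognition process would open a TCP connection to the machine running the simulation and call send_command once per recognized gesture, while the receiving side would split the stream on newlines and parse each JSON message.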