Nonlinear multi-head cross-attention network and programmable gradient information for gaze estimation

Abstract

Gaze estimation is an important indicator of human behavior that can be used for human assistance. Recent gaze estimation methods are primarily based on convolutional neural networks (CNNs) or attention Transformers. However, CNNs extract a limited local context while losing important global information, whereas attention mechanisms exhibit low utilization of multiscale hybrid features. To address these issues, we propose a novel nonlinear multi-head cross-attention network with programmable gradient information (MCA-PGI), which synthesizes the advantages of CNNs and the Transformer. The programmable gradient information is used to achieve reliable gradient propagation. An auxiliary branch is incorporated to integrate the gradient information, thereby retaining more original information than CNNs. In addition, nonlinear multi-head cross-attention is employed to fuse the global visual and multiscale hybrid features for more accurate gaze estimation. Experimental results on three publicly available datasets demonstrate that the proposed MCA-PGI exhibits strong competitiveness and outperforms most state-of-the-art methods, achieving 2.5% and 10.2% performance improvements on the MPIIFaceGaze and Eyediap datasets, respectively. The implementation code can be found at https://github.com/Yuhang-Hong/MCA-PGI.
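
To make the fusion step concrete, the following is a minimal, dependency-free sketch of standard multi-head cross-attention, in which one set of tokens (queries) attends over another (context). It is not the authors' implementation: the abstract does not specify the nonlinear component, so it is omitted here, and the function names, token counts, feature width, and random weights are all illustrative assumptions. The linked repository is the reference for the actual method.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned projections (hypothetical)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def cross_attention(queries, context, num_heads, rng):
    """Standard multi-head cross-attention: queries attend over context tokens.

    queries: n_q x d matrix (assumed here to play the role of global visual tokens)
    context: n_c x d matrix (assumed here to play the role of multiscale hybrid tokens)
    Returns the fused n_q x d matrix and the per-head attention weights.
    """
    d = len(queries[0])
    assert d % num_heads == 0, "feature width must divide evenly across heads"
    d_head = d // num_heads
    w_q, w_k, w_v, w_o = (random_matrix(d, d, rng) for _ in range(4))
    q, k, v = matmul(queries, w_q), matmul(context, w_k), matmul(context, w_v)

    head_outputs, head_weights = [], []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)
        q_h = [row[cols] for row in q]
        k_h = [row[cols] for row in k]
        v_h = [row[cols] for row in v]
        # Scaled dot-product scores: one row per query token, one column per context token.
        k_t = [list(col) for col in zip(*k_h)]
        scores = [[s / math.sqrt(d_head) for s in row] for row in matmul(q_h, k_t)]
        weights = [softmax(row) for row in scores]
        head_outputs.append(matmul(weights, v_h))
        head_weights.append(weights)

    # Concatenate heads along the feature axis, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(queries))]
    return matmul(concat, w_o), head_weights

rng = random.Random(0)
visual = random_matrix(2, 8, rng)  # 2 hypothetical global visual tokens, width 8
hybrid = random_matrix(5, 8, rng)  # 5 hypothetical multiscale feature tokens, width 8
fused, weights = cross_attention(visual, hybrid, num_heads=2, rng=rng)
```

Each query token receives a convex combination of the context values per head, so the fused output keeps the query shape (here 2 x 8) while drawing its content from the context tokens.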
