Exploring the Potential of Electroencephalography Signal-Based Image Generation Using Diffusion Models: Integrative Framework Combining Mixed Methods and Multimodal Analysis

Abstract

BACKGROUND: Electroencephalography (EEG) has been widely used to measure brain activity, but generating accurate images from neural signals remains a challenge. Most EEG-decoding research has focused on tasks such as motor imagery, emotion recognition, and brain wave classification, which involve EEG signal analysis and classification. Some studies have explored the correlation between EEG and images, primarily focusing on EEG-image pair classification or transformation. However, EEG-based image generation remains underexplored.

OBJECTIVE: The primary goal of this study was to extend EEG-based classification to image generation, addressing the limitations of previous methods and unlocking the full potential of EEG for image synthesis. To achieve more meaningful EEG-to-image generation, we developed a novel framework, Neural-Cognitive Multimodal EEG-Informed Image (NECOMIMI), which was specifically designed to generate images directly from EEG signals.

METHODS: We developed a 2-stage NECOMIMI method, which integrated the novel Neural Encoding Representation Vectorizer (NERV) EEG encoder that we designed with a diffusion-based generative model. The Category-Based Assessment Table (CAT) score was introduced to evaluate the semantic quality of EEG-generated images. In addition, the ThingsEEG dataset was used to validate and benchmark the CAT score, providing a standardized measure for assessing EEG-to-image generation performance.

RESULTS: The NERV EEG encoder achieved state-of-the-art performance in several zero-shot classification tasks, with an average accuracy of 94.8% (SD 1.7%) in the 2-way task and 86.8% (SD 3.4%) in the 4-way task, outperforming models such as Natural Image Contrast EEG, Multimodal Similarity-Keeping Contrastive Learning, and Adaptive Thinking Mapper ShallowNet. This highlighted its superiority as a feature extraction tool for EEG signals. In a 1-stage image generation framework, EEG embeddings often resulted in abstract or generalized images such as landscapes instead of specific objects. Our proposed 2-stage NECOMIMI architecture effectively extracted semantic information from noisy EEG signals, showing its ability to capture and represent underlying concepts derived from brain wave activity. We further conducted a perturbation study to test whether the model overly depended on visual cortex EEG signals for scene-based image generation. The perturbation of visual cortex EEG channels led to a notable increase in Fréchet inception distance scores, suggesting that our model relied heavily on posterior brain signals to generate semantically coherent images.

CONCLUSIONS: NECOMIMI demonstrated the potential of EEG-to-image generation, revealing the challenges of translating noisy EEG data into accurate visual representations. The novel NERV EEG encoder for multimodal contrastive learning reached state-of-the-art performance on both n-way zero-shot classification and EEG-informed image generation. The introduction of the CAT score provided a new evaluation metric, paving the way for future research to refine generative models. In addition, this study highlighted the significant clinical potential of EEG-to-image generation, particularly in enhancing brain-machine interface systems and improving quality of life for individuals with motor impairments.
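
For readers unfamiliar with the n-way zero-shot classification task reported in the results, the following is a minimal sketch of how such an accuracy could be computed from paired embeddings. It is not the authors' implementation. The function names (`cosine`, `n_way_zero_shot_accuracy`), the use of cosine similarity, the random sampling of distractors, and the toy data are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def n_way_zero_shot_accuracy(eeg_emb, img_emb, n_way, seed=0):
    """Hypothetical n-way evaluation over paired embeddings.

    eeg_emb[i] and img_emb[i] are assumed to describe the same concept.
    For each EEG embedding, the true image embedding competes against
    n_way - 1 randomly drawn distractors; the trial counts as correct
    when the true image has the highest cosine similarity.
    """
    rng = random.Random(seed)
    n = len(eeg_emb)
    correct = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Candidate 0 is always the true concept; the rest are distractors.
        candidates = [i] + rng.sample(others, n_way - 1)
        sims = [cosine(eeg_emb[i], img_emb[j]) for j in candidates]
        best = max(range(n_way), key=sims.__getitem__)
        if best == 0:
            correct += 1
    return correct / n


# Toy data (assumption): image embeddings are random vectors and the
# "EEG" embeddings are noisy copies of them.
_rng = random.Random(42)
img = [[_rng.gauss(0, 1) for _ in range(16)] for _ in range(20)]
eeg = [[x + 0.1 * _rng.gauss(0, 1) for x in row] for row in img]

acc2 = n_way_zero_shot_accuracy(eeg, img, n_way=2)
acc4 = n_way_zero_shot_accuracy(eeg, img, n_way=4)
print(f"2-way accuracy: {acc2:.3f}, 4-way accuracy: {acc4:.3f}")
```

Under this setup, chance level is 1/n (50% for 2-way, 25% for 4-way), which gives context for the reported 94.8% and 86.8% figures. The 4-way task is harder because each trial includes more distractors.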
