Abstract
OBJECTIVE: The reconstruction of visual stimuli and captions from brain activity offers a distinctive viewpoint on how perception reconstructs the external world within neural dynamics. Despite considerable advances in deep generative models in recent years, simultaneously generating images and captions with both fine-grained accuracy and semantic consistency remains a significant challenge.

METHODS: We introduce panoptic segmentation and generative semantics into brain decoding for the first time, providing enhanced multi-level data support and a novel perspective on the field. Using multi-scale fusion techniques, we integrate pixel features from natural images with structural features from panoptic segmentation to create a state-of-the-art "initial guess." Building on a neural paradigm that we identified, we propose a semantic connection strategy to guide image reconstruction. Additionally, we fine-tune visual semantics within the compressed encoding space of a language model and combine our retrieval module with the comprehension capabilities of large language models (LLMs) to generate high-quality brain captions.

RESULTS: Experimental results demonstrate that our method surpasses current approaches on visual decoding and brain captioning tasks. We provide a webpage showcasing the results: www.neuai4science.cn:5001/brain_visual_decode.

CONCLUSION: Our proposed Brain-Imager framework, which incorporates multi-level data and semantic guidance, sets a new standard in the field.

SIGNIFICANCE: This work provides a novel perspective on the relationship between text and image semantics and the visual pathways of the human brain, with potential applications in downstream tasks such as brain-computer interfaces. Our code is publicly available at https://github.com/songqianyi01/Brain-Imager.
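To make the fusion step in METHODS concrete, the sketch below shows one plausible realization of multi-scale fusion of pixel features and panoptic-segmentation structural features. It is a minimal PyTorch sketch, not the paper's implementation: the module name `MultiScaleFusion`, the channel widths, and the concatenate-then-project (1x1 convolution) design are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """Illustrative sketch: fuse pixel features from a natural image with
    structural features from its panoptic segmentation at several scales."""

    def __init__(self, channels=(64, 128, 256)):
        super().__init__()
        # One 1x1 projection per scale; the channel widths are assumptions.
        self.fuse = nn.ModuleList(
            nn.Conv2d(2 * c, c, kernel_size=1) for c in channels
        )

    def forward(self, pixel_feats, seg_feats):
        # Both arguments: lists of feature maps ordered fine-to-coarse,
        # with matching shapes per scale, e.g. (B, 64, 64, 64), ...
        fused = []
        for proj, (p, s) in zip(self.fuse, zip(pixel_feats, seg_feats)):
            fused.append(proj(torch.cat([p, s], dim=1)))  # concat, then project
        return fused  # multi-scale features forming the "initial guess"


# Minimal usage example with random tensors standing in for real features.
shapes = [(64, 64), (128, 32), (256, 16)]
pixel_feats = [torch.randn(1, c, s, s) for c, s in shapes]
seg_feats = [torch.randn(1, c, s, s) for c, s in shapes]
out = MultiScaleFusion()(pixel_feats, seg_feats)
print([f.shape for f in out])
```

A downstream decoder would then map such fused features to the reconstructed image; how Brain-Imager performs that mapping is described in the body of the paper, not in this sketch.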