Reconstructing music perception from brain activity using a prior guided diffusion model


Abstract

Reconstructing music directly from brain activity provides insight into the neural representations underlying auditory processing and paves the way for future brain-computer interfaces. We introduce a fully data-driven pipeline that combines cross-subject functional alignment with Bayesian decoding in the latent space of a diffusion-based audio generator. Functional alignment projects individual fMRI responses onto a shared representational manifold, improving cross-participant decoding accuracy relative to anatomically normalized baselines. A Bayesian search over latent trajectories then selects the most plausible waveform candidate, stabilizing reconstructions against neural noise. Crucially, we bridge CLAP's multi-modal embeddings to music-domain latents through a dedicated aligner, eliminating the need for hand-crafted captions and preserving the intrinsic structure of musical features. Evaluated on ten diverse genres, the model achieves a cross-subject-averaged identification accuracy of [Formula: see text] and produces audio that human listeners recognize above chance in 85.7% of trials. Voxel-wise analyses localize the predictive signal to a bilateral circuit spanning early auditory, inferior-frontal, and premotor cortices, consistent with hierarchical and sensorimotor theories of music perception. The framework establishes a principled bridge between generative audio models and cognitive neuroscience.
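The abstract's first step, projecting each subject's fMRI responses onto a shared representational manifold, can be illustrated with a hyperalignment-style orthogonal Procrustes fit. This is a minimal sketch under that assumption, not the authors' implementation; `align_to_template` and all data shapes are hypothetical.

```python
# Hypothetical sketch of cross-subject functional alignment: each subject's
# voxel responses to shared stimuli are rotated into a common template space
# via orthogonal Procrustes (hyperalignment-style). Names and shapes are
# illustrative only.
import numpy as np
from scipy.linalg import orthogonal_procrustes

def align_to_template(subject_data: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Map one subject's responses (stimuli x voxels) onto a shared template.

    Both arrays must be sampled on the same stimuli so that rows correspond.
    Returns the subject's data projected onto the shared manifold.
    """
    # Center each voxel so the rotation is estimated on response patterns
    # rather than mean offsets.
    subj = subject_data - subject_data.mean(axis=0)
    temp = template - template.mean(axis=0)
    # R minimizes ||subj @ R - temp||_F over orthogonal matrices R.
    R, _ = orthogonal_procrustes(subj, temp)
    return subj @ R

# Usage: build the template from already-aligned subjects, then project a
# held-out subject's responses into the shared space.
rng = np.random.default_rng(0)
template = rng.standard_normal((120, 500))      # 120 stimuli x 500 voxels
new_subject = rng.standard_normal((120, 500))
shared = align_to_template(new_subject, template)
```

A rotation-only map like this preserves the geometry of each subject's response space, which is one plausible reason functional alignment can beat anatomical normalization for cross-participant decoding.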
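The "Bayesian search over latent trajectories" can be pictured as sampling several candidate latents from the diffusion model and keeping the one most plausible under the fMRI-decoded posterior. The isotropic Gaussian posterior below is an assumption for illustration; the paper's scoring rule may differ.

```python
# A minimal sketch of Bayesian candidate selection: score each sampled
# latent trajectory under a Gaussian posterior whose mean was decoded from
# fMRI, and keep the highest-scoring candidate. All names are assumptions.
import numpy as np

def log_gaussian(z: np.ndarray, mu: np.ndarray, sigma: float) -> float:
    """Log-density of an isotropic Gaussian N(mu, sigma^2 I), up to a constant."""
    return -0.5 * np.sum((z - mu) ** 2) / sigma**2

def select_candidate(candidates: list[np.ndarray],
                     mu: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Return the candidate latent most plausible under the decoded posterior."""
    scores = [log_gaussian(z, mu, sigma) for z in candidates]
    return candidates[int(np.argmax(scores))]
```

Averaging over neural noise this way is what stabilizes the reconstruction: a single noisy decode no longer dictates the generated waveform.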
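The "dedicated aligner" bridging CLAP embeddings to the generator's music-domain latents could, in its simplest form, be a linear map fit on paired music clips. The ridge-regression form and every name here are assumptions; the paper's aligner may well be nonlinear.

```python
# A hedged sketch of the CLAP-to-latent aligner: a ridge-regression map from
# CLAP's multi-modal embedding space into the audio generator's latent space,
# fit on paired music clips. Purely illustrative.
import numpy as np

def fit_aligner(clap_emb: np.ndarray, music_latents: np.ndarray,
                lam: float = 1.0) -> np.ndarray:
    """Closed-form ridge solution W mapping CLAP embeddings to latents.

    clap_emb: (n_clips, d_clap); music_latents: (n_clips, d_latent).
    """
    d = clap_emb.shape[1]
    gram = clap_emb.T @ clap_emb + lam * np.eye(d)
    return np.linalg.solve(gram, clap_emb.T @ music_latents)

# At decoding time, a CLAP embedding predicted from fMRI conditions the
# generator directly: latent = clap_vector @ W. No hand-crafted caption
# is needed anywhere in the loop.
```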
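Finally, the identification accuracy reported in the abstract is commonly computed by checking whether each prediction matches its own stimulus better than any other. This sketch assumes a correlation-based criterion; the metric's exact definition is not given here.

```python
# An illustrative identification-accuracy metric: a prediction counts as
# identified when it correlates more strongly with its own stimulus's
# ground-truth embedding than with any other stimulus's. Names are assumed.
import numpy as np

def identification_accuracy(pred: np.ndarray, true: np.ndarray) -> float:
    """pred, true: (n_stimuli, n_features). Returns the fraction identified."""
    # Row-wise z-scoring turns the inner product into a Pearson correlation.
    p = (pred - pred.mean(1, keepdims=True)) / pred.std(1, keepdims=True)
    t = (true - true.mean(1, keepdims=True)) / true.std(1, keepdims=True)
    corr = p @ t.T / pred.shape[1]
    # A trial is correct when the diagonal entry is the row maximum.
    return float(np.mean(corr.argmax(axis=1) == np.arange(len(pred))))
```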
