Obfuscation via pitch-shifting for balancing privacy and diagnostic utility in voice-based cognitive assessment

Abstract

INTRODUCTION: Digital voice analysis is an emerging tool for differentiating cognitive states, but it poses privacy risks as automated systems may inadvertently identify speakers.

METHODS: We developed a computational framework to evaluate the trade-off between voice obfuscation and cognitive assessment accuracy, using pitch-shifting as a representative method. This framework was applied to voice recordings from the Framingham Heart Study (FHS, n = 128) and the DementiaBank Delaware (DBD, n = 85) corpus, both featuring responses to neuropsychological tests. Speaker obfuscation was measured via equal error rate (EER), and diagnostic utility was assessed through machine learning models distinguishing cognitive states: normal cognition (NC), mild cognitive impairment (MCI), and dementia (DE).

RESULTS: With the top 20 acoustic features, our framework achieved classification accuracies of 62.2% (EER: 0.3335) on the FHS dataset for NC, MCI, and DE differentiation, and 63.7% (EER: 0.1796) on the DBD dataset for NC and MCI differentiation, using obfuscated speech files.

DISCUSSION: Our results demonstrate the feasibility of privacy-preserving voice markers, offering a scalable solution for voice-based cognitive assessments.

HIGHLIGHTS: We developed a computational framework using pitch-shifting and acoustic transformations to balance speaker privacy and diagnostic utility in voice-based cognitive assessments. We evaluated the framework on two independent datasets, Framingham Heart Study (FHS, n = 128) and DementiaBank Delaware (DBD, n = 85) corpus, assessing the trade-off between privacy (measured by equal error rate [EER]) and classification accuracy. Our framework achieved classification accuracies of 62.2% (EER: 0.3335) for distinguishing normal cognition (NC), mild cognitive impairment (MCI), and dementia in the FHS dataset and 63.7% (EER: 0.1796) for NC and MCI differentiation in the DBD dataset, using obfuscated speech files. Our framework demonstrates that pitch-shifting levels can preserve diagnostic utility while protecting speaker identity, offering a scalable and privacy-preserving solution.
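The abstract reports privacy as an equal error rate (EER), the operating point of a speaker-verification system at which the false acceptance rate equals the false rejection rate. A higher EER on obfuscated speech generally means the verifier confuses speakers more often. The sketch below illustrates one common way to approximate the EER from similarity scores. It is not the authors' implementation: the function name `equal_error_rate`, the score lists, and the threshold sweep over observed scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores.

    genuine  -- scores for same-speaker trial pairs (hypothetical input)
    impostor -- scores for different-speaker trial pairs (hypothetical input)
    Higher scores are assumed to mean "more likely the same speaker".
    """
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")

    best_gap, eer = float("inf"), None
    # Sweep every observed score as a candidate decision threshold.
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials scored at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    genuine_scores = [0.9, 0.8, 0.4, 0.7]
    impostor_scores = [0.1, 0.5, 0.3, 0.2]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With only a handful of trials the sweep gives a coarse estimate; evaluations on real corpora typically use many more trial pairs and may interpolate between thresholds.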
