Analogue speech recognition based on physical computing

Abstract

With the rise of decentralized computing, such as in the Internet of Things, autonomous driving and personalized healthcare, it is increasingly important to process time-dependent signals 'at the edge' efficiently: right at the place where the temporal data are collected, avoiding time-consuming, insecure and costly communication with a centralized computing facility (or 'cloud'). However, modern-day processors often cannot meet the constrained power and time budgets of edge systems because of intrinsic limitations imposed by their architecture (von Neumann bottleneck) or domain conversions (analogue to digital and time to frequency). Here we propose an edge temporal-signal processor based on two in-materia computing systems for both feature extraction and classification, reaching near-software accuracy for the TI-46-Word(1) and Google Speech Commands(2) datasets. First, a nonlinear, room-temperature reconfigurable-nonlinear-processing-unit(3,4) layer realizes analogue, time-domain feature extraction from the raw audio signals, similar to the human cochlea. Second, an analogue in-memory computing chip(5), consisting of memristive crossbar arrays, implements a compact neural network trained on the extracted features for classification. With submillisecond latency, reconfigurable-nonlinear-processing-unit-based feature extraction consuming roughly 300 nJ per inference, and the analogue in-memory computing-based classifier using around 78 µJ (with potential for roughly 10 µJ)(6), our findings offer a promising avenue for advancing the compactness, efficiency and performance of heterogeneous smart edge processors through in-materia computing hardware.
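To put the quoted energy figures side by side, the following is a minimal sketch that adds the feature-extraction and classifier energies per inference and reports the share taken by feature extraction. It assumes the two stages simply add up and ignores any interface or data-movement overhead, which the abstract does not quantify; the constant and function names are hypothetical and chosen only for illustration.

```python
# Approximate per-inference energies quoted in the abstract.
RNPU_FEATURE_EXTRACTION_J = 300e-9    # roughly 300 nJ (feature extraction)
AIMC_CLASSIFIER_J = 78e-6             # around 78 µJ (classifier, reported)
AIMC_CLASSIFIER_POTENTIAL_J = 10e-6   # roughly 10 µJ (classifier, stated potential)


def energy_budget(feature_j, classifier_j):
    """Return (total energy in joules, fraction spent on feature extraction).

    Assumption: the total is the plain sum of the two stages; no other
    overheads are modelled because the abstract gives no figures for them.
    """
    total = feature_j + classifier_j
    return total, feature_j / total


if __name__ == "__main__":
    scenarios = [
        ("reported classifier", AIMC_CLASSIFIER_J),
        ("potential classifier", AIMC_CLASSIFIER_POTENTIAL_J),
    ]
    for label, classifier_j in scenarios:
        total, share = energy_budget(RNPU_FEATURE_EXTRACTION_J, classifier_j)
        print(f"{label}: total = {total * 1e6:.1f} µJ, "
              f"feature extraction share = {share * 100:.2f}%")
```

Under these assumptions the total is about 78.3 µJ per inference with the reported classifier and about 10.3 µJ with the stated potential, so the classifier dominates the budget in both cases and feature extraction stays below 3% of the total.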
