Multi-modal music techniques for synthesizing high-quality audio waveforms from MIDI data


Abstract

A capable music synthesizer should deliver high-fidelity audio for a mix of instruments and voices. Current synthesizers typically force a choice between specialized models that provide detailed control over specific instruments and flexible waveform models that accommodate a variety of music at the expense of precision. To overcome these limitations, this paper introduces MIAO, a neural music synthesizer for interactive and expressive music synthesis that converts MIDI sequences into rich, dynamic audio. MIAO is trained on diverse transcription datasets that pair MIDI with audio, deepening its understanding of MIDI structure and strengthening its representation learning. As a result, MIAO offers precise note-level control over composition and instrumentation and handles a wide spectrum of instruments. We evaluate MIAO on six datasets: MAESTROv3 (piano), Slakh2100 (synthetic multi-instrument), Cerberus4 (synthetic multi-instrument), Guitarset (guitar), MusicNet (orchestral multi-instrument), and URMP (orchestral multi-instrument), where it sets new performance benchmarks.
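To make the MIDI-to-audio mapping concrete, the sketch below shows the rule-based baseline that a neural synthesizer such as MIAO learns to replace: converting a MIDI note number to a frequency under standard tuning and rendering it as a plain sine waveform. This is purely illustrative and is not MIAO's model; the function names and parameters are assumptions for this example.

```python
import math

def midi_to_hz(note: int) -> float:
    # Standard MIDI tuning: note 69 (A4) = 440 Hz, 12 semitones per octave.
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def render_note(note: int, duration_s: float, sample_rate: int = 16000) -> list[float]:
    # Toy additive render of a single MIDI note as a sine wave.
    # A neural synthesizer replaces this fixed oscillator with a learned
    # mapping that captures timbre, dynamics, and instrument identity.
    freq = midi_to_hz(note)
    n_samples = int(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * freq * i / sample_rate)
            for i in range(n_samples)]

samples = render_note(69, 0.1)  # A4 (440 Hz) for 100 ms
```

Note-level control in this toy setting amounts to choosing the pitch and duration per note; the paper's point is that a learned model can expose the same per-note controls while producing realistic multi-instrument audio instead of sine tones.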
