On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks
In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on the impact of the choice of low-pass filter used when training and subsequently testing the network. For two different state-of-the-art deep architectures, ResNet and U-Net, we demonstrate that when the training and testing filters are matched, improvements in signal-to-noise ratio (SNR) of up to 7 dB can be obtained. However, when these filters differ, the improvement falls considerably, and under some training conditions the output has a lower SNR than the band-limited input. To circumvent this apparent overfitting to filter shape, we propose a data augmentation strategy that uses multiple low-pass filters during training and leads to improved generalization to unseen filtering conditions at test time.
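As an illustration of the augmentation idea described above, the following is a minimal sketch of how band-limited training inputs might be generated with a randomly chosen low-pass filter, together with a common definition of SNR. The abstract does not specify the filter families, orders, cutoff frequency, or the exact SNR formula, so `FILTER_BANK`, `design_lowpass`, `make_training_pair`, `snr_db`, and all numeric values below are assumptions chosen only for the example.

```python
import numpy as np
from scipy import signal

# Hypothetical filter bank: (family, order). The abstract does not list
# the filters actually used; these are illustrative assumptions.
FILTER_BANK = [("butter", 6), ("cheby1", 6), ("ellip", 6), ("bessel", 6)]


def design_lowpass(kind, order, cutoff_hz, sample_rate):
    """Return second-order sections for a digital low-pass filter."""
    common = dict(btype="low", fs=sample_rate, output="sos")
    if kind == "butter":
        return signal.butter(order, cutoff_hz, **common)
    if kind == "cheby1":
        # 1 dB passband ripple (assumed value)
        return signal.cheby1(order, 1.0, cutoff_hz, **common)
    if kind == "ellip":
        # 1 dB passband ripple, 40 dB stopband attenuation (assumed values)
        return signal.ellip(order, 1.0, 40.0, cutoff_hz, **common)
    if kind == "bessel":
        return signal.bessel(order, cutoff_hz, **common)
    raise ValueError(f"unknown filter kind: {kind}")


def make_training_pair(full_band, rng, cutoff_hz, sample_rate):
    """Pick one filter at random and return (band_limited_input, target)."""
    kind, order = FILTER_BANK[rng.integers(len(FILTER_BANK))]
    sos = design_lowpass(kind, order, cutoff_hz, sample_rate)
    # Zero-phase filtering keeps the input time-aligned with the target.
    band_limited = signal.sosfiltfilt(sos, full_band)
    return band_limited, full_band


def snr_db(reference, estimate):
    """SNR in dB of an estimate with respect to the full-band reference."""
    noise = reference - estimate
    # Small constant avoids division by zero for a perfect estimate.
    return 10.0 * np.log10(np.sum(reference**2) / (np.sum(noise**2) + 1e-12))
```

Calling `make_training_pair` once per training example, with a seeded `np.random.default_rng`, would expose a network to several filter shapes instead of a single one. Evaluating `snr_db` on the band-limited input and on the network output against the same full-band reference gives the kind of SNR improvement the abstract reports.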