D3Net: Densely connected multidilated DenseNet for music source separation

Uploaded: 2021-01-24 08:14:04 | PDF file, 303.77 KB | Views: 198

Music source separation requires a large input field to model the long-term dependencies of an audio signal. Previous convolutional neural network (CNN)-based approaches model the large input field either by sequentially down- and up-sampling feature maps or by using dilated convolution. In this paper, we argue for the importance of rapid growth of the receptive field and simultaneous modeling of multi-resolution data in a single convolution layer, and propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net). D3Net involves a novel multidilated convolution that has different dilation factors in a single layer to model different resolutions simultaneously. By combining the multidilated convolution with the DenseNet architecture, D3Net avoids the aliasing problem that arises when dilated convolution is naively incorporated into DenseNet. Experimental results on the MUSDB18 dataset show that D3Net achieves state-of-the-art performance with an average signal-to-distortion ratio (SDR) of 6.01 dB.
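To make the idea of "different dilation factors in a single layer" concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a 1-D signal, zero padding, and that each input channel group is convolved with its own dilation (1, 2, 4 here) before the results are summed; the function names `dilated_conv1d` and `multidilated_conv1d` and these dilation values are hypothetical choices for illustration only.

```python
def dilated_conv1d(x, w, dilation):
    """Zero-padded, same-length 1-D dilated convolution of a single channel.

    Implemented as cross-correlation, as is conventional in CNN libraries.
    """
    center = len(w) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, weight in enumerate(w):
            idx = t + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):              # samples outside the signal count as zero
                acc += weight * x[idx]
        out.append(acc)
    return out


def multidilated_conv1d(groups, kernels, dilations):
    """One layer output: each input group uses its own dilation, then all are summed.

    Assumption for illustration: one kernel and one dilation per input group.
    """
    y = [0.0] * len(groups[0])
    for x, w, d in zip(groups, kernels, dilations):
        conv = dilated_conv1d(x, w, d)
        y = [a + b for a, b in zip(y, conv)]
    return y


if __name__ == "__main__":
    # Three input groups, each a unit impulse at position 8.
    impulse = [0.0] * 17
    impulse[8] = 1.0
    y = multidilated_conv1d(
        groups=[impulse, impulse, impulse],
        kernels=[[1.0, 1.0, 1.0]] * 3,
        dilations=[1, 2, 4],
    )
    # Nonzero outputs appear at offsets 0, ±1, ±2 and ±4 around position 8,
    # so a single layer mixes three resolutions at once.
    print([i for i, v in enumerate(y) if v != 0.0])
```

With a single shared dilation the same layer would see only one spacing of taps; giving each group its own dilation is what lets one layer cover several resolutions simultaneously, under the assumptions stated above.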

