
Delay Differential Neural Networks

Uploaded 2021-01-24 06:12:36 · PDF file, 1.51 MB · 21 views


Neural ordinary differential equations (NODEs) treat the computation of intermediate feature vectors as trajectories of an ordinary differential equation parameterized by a neural network. In this paper, we propose a novel model, delay differential neural networks (DDNN), inspired by delay differential equations (DDEs). The proposed model treats the derivative of the hidden feature vector as a function of the current feature vector and past feature vectors (the history). This function is modeled as a neural network and consequently yields continuous-depth alternatives to many recent ResNet variants. We propose two different DDNN architectures, depending on how the current and past feature vectors are combined. For training DDNNs, we provide a memory-efficient adjoint method for computing gradients and backpropagating through the network. DDNN improves the data efficiency of NODE by further reducing the number of parameters without affecting generalization performance. Experiments conducted on synthetic and real-world image classification datasets such as CIFAR-10 and CIFAR-100 show the effectiveness of the proposed models.
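The core idea above — the derivative of the hidden state depends on both the current state h(t) and a delayed state h(t − τ), with the right-hand side given by a neural network — can be sketched with a fixed-step Euler integrator. This is a minimal illustration, not the paper's method: the two-layer network `mlp`, the constant pre-history h(t) = h0 for t ≤ 0, and all sizes and step counts are assumptions for the sketch (the paper uses its own architectures and a memory-efficient adjoint method for gradients).

```python
import numpy as np

def mlp(params, h, h_delayed):
    # Toy stand-in for the learned derivative f(h(t), h(t - tau)):
    # a two-layer tanh network applied to the concatenated states.
    W1, b1, W2, b2 = params
    x = np.concatenate([h, h_delayed])
    return W2 @ np.tanh(W1 @ x + b1) + b2

def ddnn_forward(params, h0, tau, t_end, dt):
    """Fixed-step Euler integration of dh/dt = f(h(t), h(t - tau)).
    The history for t <= 0 is taken to be the constant initial state h0."""
    n_steps = int(round(t_end / dt))
    delay_steps = int(round(tau / dt))
    traj = [h0]
    for k in range(n_steps):
        h = traj[-1]
        # Look up the delayed state; fall back to h0 before t = tau.
        h_delayed = traj[k - delay_steps] if k >= delay_steps else h0
        traj.append(h + dt * mlp(params, h, h_delayed))
    return np.stack(traj)  # trajectory of hidden states, shape (n_steps + 1, dim)

rng = np.random.default_rng(0)
d, hidden = 4, 8
params = (rng.normal(scale=0.1, size=(hidden, 2 * d)), np.zeros(hidden),
          rng.normal(scale=0.1, size=(d, hidden)), np.zeros(d))
out = ddnn_forward(params, h0=np.ones(d), tau=0.2, t_end=1.0, dt=0.05)
print(out.shape)
```

In a trained model the final state `out[-1]` would feed a classification head, and gradients with respect to `params` would be computed by the adjoint method rather than by storing the whole trajectory, which is what makes the approach memory-efficient.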

