ACDC: A Structured Efficient Linear Layer

Uploaded: 2021-01-24 05:13:55 · PDF file, 380.71 KB · 16 views

The linear layer is one of the most pervasive modules in deep learning representations. However, it requires $O(N^2)$ parameters and $O(N^2)$ operations. These costs can be prohibitive in mobile applications or prevent scaling in many domains. Here, we introduce a deep, differentiable, fully-connected neural network module composed of diagonal matrices of parameters, $\mathbf{A}$ and $\mathbf{D}$, and the discrete cosine transform $\mathbf{C}$. The core module, structured as $\mathbf{ACDC^{-1}}$, has $O(N)$ parameters and incurs $O(N \log N)$ operations. We present theoretical results showing how deep cascades of ACDC layers approximate linear layers. ACDC is, however, a stand-alone module and can be used in combination with any other type of module. In our experiments, we show that it can indeed be successfully interleaved with ReLU modules in convolutional neural networks for image recognition. Our experiments also study critical factors in the training of these structured modules, including initialization and depth. Finally, this paper also provides a connection between structured linear transforms used in deep learning and the field of Fourier optics, illustrating how ACDC could in principle be implemented with lenses and diffractive elements.
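To make the $\mathbf{ACDC^{-1}}$ structure concrete, here is a minimal sketch of a single layer in plain Python. It is not the authors' implementation. The function names (`dct_matrix`, `acdc_layer`), the choice of the orthonormal DCT-II for $\mathbf{C}$, and the left-to-right order of application (scale by $\mathbf{A}$, transform, scale by $\mathbf{D}$, inverse transform) are assumptions made for illustration. For readability the sketch builds the DCT as an explicit matrix, which costs $O(N^2)$ operations; a fast DCT routine would be needed to reach the $O(N \log N)$ cost stated above.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n). Because it is orthonormal,
    its inverse is simply its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matvec(m, v):
    """Dense matrix-vector product (stand-in for a fast transform)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def acdc_layer(x, a, d):
    """Apply one ACDC layer to vector x.

    a, d: the diagonals of A and D (length N each), so the layer has
    2N learnable parameters instead of the N^2 of a dense linear layer.
    Assumed order of application: A, then C, then D, then C^{-1}.
    """
    c = dct_matrix(len(x))
    c_inv = [list(col) for col in zip(*c)]      # transpose = inverse
    h = [a_i * x_i for a_i, x_i in zip(a, x)]   # A: elementwise scaling
    h = matvec(c, h)                            # C: cosine transform
    h = [d_i * h_i for d_i, h_i in zip(d, h)]   # D: scaling in DCT domain
    return matvec(c_inv, h)                     # C^{-1}: back to input domain
```

With `a` and `d` both set to all ones, the two scalings do nothing and the transform cancels against its inverse, so the layer reduces to the identity map. Any other choice of the $2N$ values yields a structured, generally dense $N \times N$ linear map.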
