

Uploaded 2021-01-24 07:53:26 · PDF · 536.81 KB

Why Layer-Wise Learning is Hard to Scale-up and a Possible Solution via Accelerated Downsampling

Layer-wise learning, as an alternative to global back-propagation, is easy to interpret and analyze, and it is memory efficient. Recent studies demonstrate that layer-wise learning can achieve state-of-the-art performance in image classification on various datasets. However, previous studies of layer-wise learning are limited to networks with simple hierarchical structures, and performance degrades severely for deeper networks such as ResNet. This paper, for the first time, reveals that the fundamental obstacle to scaling up layer-wise learning is the relatively poor separability of the feature space in shallow layers. This argument is empirically verified by controlling the intensity of the convolution operation in local layers. We discover that the poorly separable features from shallow layers are mismatched with the strong supervision constraint imposed throughout the entire network, making layer-wise learning sensitive to network depth. The paper further proposes a downsampling-acceleration approach that weakens the learning in shallow layers so as to transfer the learning emphasis to the deep feature space, where separability better matches the supervision constraint. Extensive experiments verify the new finding and demonstrate the advantages of the proposed downsampling acceleration in improving the performance of layer-wise learning.
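To make the setting concrete, below is a minimal numpy sketch of greedy layer-wise training: each block is optimized against its own local auxiliary classifier, and its output is then treated as a fixed ("detached") input for the next block, so no gradient ever flows backward across block boundaries. The toy data, network widths, and hyperparameters are illustrative assumptions, not the paper's architecture or training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_local_block(x, y, width, steps=200, lr=0.5):
    """Train one block (linear + ReLU) with its own auxiliary softmax head.

    x is treated as a constant: gradients update only this block's
    weights W and its local head V, never any earlier block.
    """
    n, d = x.shape
    k = y.max() + 1
    W = rng.normal(0, 0.1, (d, width))   # block weights
    V = rng.normal(0, 0.1, (width, k))   # local auxiliary head
    onehot = np.eye(k)[y]
    for _ in range(steps):
        h = np.maximum(x @ W, 0)         # block output (ReLU features)
        p = softmax(h @ V)               # local prediction
        g = (p - onehot) / n             # dL/dlogits for cross-entropy
        dV = h.T @ g
        dh = g @ V.T * (h > 0)           # backprop stops at this block's input
        dW = x.T @ dh
        W -= lr * dW
        V -= lr * dV
    h = np.maximum(x @ W, 0)
    acc = (softmax(h @ V).argmax(1) == y).mean()
    return h, acc                        # h is "detached" input for the next block

# Toy 2-class problem: label depends on the first two features.
x = rng.normal(size=(200, 10))
y = (x[:, 0] + x[:, 1] > 0).astype(int)

h1, acc1 = train_local_block(x, y, width=16)   # shallow block, local loss only
h2, acc2 = train_local_block(h1, y, width=16)  # deeper block sees frozen h1
```

The paper's diagnosis maps onto this structure: if the shallow block's features `h1` are poorly separable, its strong local supervision still forces a premature decision, and the mismatch compounds with depth. The proposed remedy, accelerated downsampling, would correspond to aggressively reducing the spatial resolution in shallow blocks so that most of the discriminative learning is deferred to deeper, more separable feature spaces.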

