

Uploaded 2021-01-24 07:53:26 · PDF · 536.81 KB

Why Layer-Wise Learning is Hard to Scale-up and a Possible Solution via Accelerated Downsampling

Layer-wise learning, as an alternative to global back-propagation, is easy to interpret and analyze, and it is memory efficient. Recent studies demonstrate that layer-wise learning can achieve state-of-the-art performance in image classification on various datasets. However, previous studies of layer-wise learning are limited to networks with simple hierarchical structures, and performance degrades severely for deeper networks such as ResNet. This paper, for the first time, reveals that the fundamental obstacle to scaling up layer-wise learning is the relatively poor separability of the feature space in shallow layers. This argument is empirically verified by controlling the intensity of the convolution operation in local layers. We discover that the poorly separable features from shallow layers are mismatched with the strong supervision constraint imposed throughout the entire network, making layer-wise learning sensitive to network depth. The paper further proposes a downsampling-acceleration approach that weakens the learning in shallow layers so as to transfer the learning emphasis to the deep feature space, where separability better matches the supervision constraint. Extensive experiments verify the new finding and demonstrate the advantages of the proposed downsampling acceleration in improving the performance of layer-wise learning.
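To make the setting concrete, below is a minimal numpy sketch of greedy layer-wise training: each block is optimized against its own local auxiliary classifier, and its output is then treated as a fixed ("detached") input for the next block, so no gradient ever flows backward across block boundaries. The toy data, network widths, and hyperparameters are illustrative assumptions, not the paper's architecture or training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_local_block(x, y, width, steps=200, lr=0.5):
    """Train one block (linear + ReLU) with its own auxiliary softmax head.

    x is treated as a constant: gradients update only this block's
    weights W and its local head V, never any earlier block.
    """
    n, d = x.shape
    k = y.max() + 1
    W = rng.normal(0, 0.1, (d, width))   # block weights
    V = rng.normal(0, 0.1, (width, k))   # local auxiliary head
    onehot = np.eye(k)[y]
    for _ in range(steps):
        h = np.maximum(x @ W, 0)         # block output (ReLU features)
        p = softmax(h @ V)               # local prediction
        g = (p - onehot) / n             # dL/dlogits for cross-entropy
        dV = h.T @ g
        dh = g @ V.T * (h > 0)           # backprop stops at this block's input
        dW = x.T @ dh
        W -= lr * dW
        V -= lr * dV
    h = np.maximum(x @ W, 0)
    acc = (softmax(h @ V).argmax(1) == y).mean()
    return h, acc                        # h is "detached" input for the next block

# Toy 2-class problem: label depends on the first two features.
x = rng.normal(size=(200, 10))
y = (x[:, 0] + x[:, 1] > 0).astype(int)

h1, acc1 = train_local_block(x, y, width=16)   # shallow block, local loss only
h2, acc2 = train_local_block(h1, y, width=16)  # deeper block sees frozen h1
```

The paper's diagnosis maps onto this structure: if the shallow block's features `h1` are poorly separable, its strong local supervision still forces a premature decision, and the mismatch compounds with depth. The proposed remedy, accelerated downsampling, would correspond to aggressively reducing the spatial resolution in shallow blocks so that most of the discriminative learning is deferred to deeper, more separable feature spaces.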

