FixNorm: Dissecting Weight Decay for Training Deep Neural Networks

Uploaded by qqvisual75478 on 2021-01-22 03:37:48 | PDF file | 712.25 KB | Popularity: 24

Weight decay is a widely used technique for training deep neural networks (DNNs). It greatly affects generalization performance, but the underlying mechanisms are not fully understood. Recent works show that for layers followed by normalization, weight decay mainly affects the effective learning rate. However, although normalization has been extensively adopted in modern DNNs, layers such as the final fully-connected layer do not satisfy this precondition, and for these layers the effects of weight decay are still unclear. In this paper, we comprehensively investigate the mechanisms of weight decay and find that, besides influencing the effective learning rate, weight decay has another distinct mechanism that is equally important: affecting generalization performance by controlling cross-boundary risk. These two mechanisms together give a more comprehensive explanation for the effects of weight decay. Based on this discovery, we propose a new training method called FixNorm, which discards weight decay and directly controls the two mechanisms. We also propose a practical method to tune the hyperparameters of FixNorm, finding near-optimal solutions 2 to 3 times faster than Bayesian Optimization. On the ImageNet classification task, training EfficientNet-B0 with FixNorm achieves 77.7%, which outperforms the original baseline by a clear margin. Surprisingly, when scaling MobileNetV2 to the same FLOPS and applying the same tricks as EfficientNet-B0, training with FixNorm achieves 77.4%, which shows the importance of well-tuned training procedures and further verifies the effectiveness of our approach. We set up more well-tuned baselines using FixNorm to facilitate fair comparisons in the community.
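The abstract's claim that, for layers followed by normalization, weight decay mainly acts through the effective learning rate can be checked on a toy problem. The sketch below is not the FixNorm method and is not taken from the paper. It assumes a single linear layer with batch normalization (no epsilon term), random toy data, and finite-difference gradients, and the names `normalized_loss`, `numeric_grad`, and `l2_norm` are hypothetical helpers. It illustrates the standard observation that such a loss does not change when the weights are rescaled, so the gradient shrinks as the norm grows and one SGD step at learning rate `lr` on weights of norm `scale * ||w||` moves the direction as much as a step at `lr / scale**2` on `w`.

```python
import math
import random


def normalized_loss(w, xs, ys):
    """Mean squared error of a linear layer whose outputs are batch-normalized."""
    z = [sum(wi * xi for wi, xi in zip(w, x)) for x in xs]
    mean = sum(z) / len(z)
    std = math.sqrt(sum((zi - mean) ** 2 for zi in z) / len(z))
    z_hat = [(zi - mean) / std for zi in z]  # normalization removes the scale of w
    return sum((zh - y) ** 2 for zh, y in zip(z_hat, ys)) / len(ys)


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of the gradient of f at w."""
    grad = []
    for i in range(len(w)):
        up, down = list(w), list(w)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


# Toy data (assumption: random inputs and targets, purely illustrative).
random.seed(0)
xs = [[random.gauss(0, 1) for _ in range(3)] for _ in range(8)]
ys = [random.gauss(0, 1) for _ in range(8)]


def loss(w):
    return normalized_loss(w, xs, ys)


w = [0.5, -1.0, 2.0]
scale = 4.0
w_big = [scale * wi for wi in w]  # same direction, 4x the norm

g = numeric_grad(loss, w)
g_big = numeric_grad(loss, w_big)

# One SGD step on the large-norm weights with learning rate lr ...
lr = 0.1
stepped_big = [wi - lr * gi for wi, gi in zip(w_big, g_big)]
# ... matches, up to the factor `scale`, a step on w with lr / scale**2.
stepped_small = [wi - (lr / scale**2) * gi for wi, gi in zip(w, g)]

print("loss(w), loss(scale * w):", loss(w), loss(w_big))
print("||grad|| at w and at scale * w:", l2_norm(g), l2_norm(g_big))
print("w . grad (should be near 0):", sum(wi * gi for wi, gi in zip(w, g)))
```

Under these assumptions, shrinking the weight norm (which is what weight decay does over time) raises the effective learning rate even though `lr` itself is unchanged. Layers that are not followed by normalization, such as the final fully-connected layer mentioned in the abstract, do not have this invariance, so the sketch says nothing about them.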
