
Making Coherence Out of Nothing At All: Measuring Evolution of Gradient Alignment

Uploaded 2021-01-22 04:06:29 · PDF, 1.19 MB · Views: 7

We propose a new metric ($m$-coherence) to experimentally study the alignment of per-example gradients during training. Intuitively, given a sample of size $m$, $m$-coherence is the number of examples in the sample that, on average, benefit from a small step along the gradient of any one example. We show that compared to other commonly used metrics, $m$-coherence is more interpretable, cheaper to compute ($O(m)$ instead of $O(m^2)$), and mathematically cleaner. (We note that $m$-coherence is closely connected to gradient diversity, a quantity previously used in some theoretical bounds.) Using $m$-coherence, we study the evolution of alignment of per-example gradients in ResNet and EfficientNet models on ImageNet and several variants with label noise, particularly from the perspective of the recently proposed Coherent Gradients (CG) theory, which provides a simple, unified explanation for memorization and generalization [Chatterjee, ICLR 20]. Although we have several interesting takeaways, our most surprising result concerns memorization. Naively, one might expect that when training with completely random labels, each example is fitted independently, and so $m$-coherence should be close to 1. However, this is not the case: $m$-coherence reaches moderately high values during training (though still much smaller than with real labels), indicating that over-parameterized neural networks find common patterns even in scenarios where generalization is not possible. A detailed analysis of this phenomenon provides a deeper confirmation of CG, but at the same time puts into sharp relief what is missing from the theory in order to provide a complete explanation of generalization in neural networks.
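To make the intuition behind the metric concrete, the sketch below shows one way a coherence-style quantity can be estimated from per-example gradients. It is a minimal sketch, not the paper's implementation: it assumes the formulation $\|\sum_i g_i\|^2 / \sum_i \|g_i\|^2$, which equals $m$ when all gradients point the same way, is close to 1 when they are mutually orthogonal, and can be accumulated in a single $O(m)$ pass; the paper's exact estimator and normalization may differ, and the gradients used here are synthetic placeholders rather than gradients of a trained model.

```python
import numpy as np

def coherence_estimate(per_example_grads):
    """Estimate how aligned a set of per-example gradients is.

    per_example_grads: array of shape (m, d), one flattened gradient per example.

    Uses the ratio ||sum_i g_i||^2 / sum_i ||g_i||^2, which is m when all
    gradients are identical and ~1 when they are mutually orthogonal. Both
    terms are accumulated in one pass over the m gradients, i.e. O(m)
    gradient-sized operations instead of O(m^2) pairwise dot products.
    (Hedged reading of the metric described in the abstract; the paper's
    exact estimator may differ.)
    """
    g = np.asarray(per_example_grads, dtype=np.float64)
    grad_sum = g.sum(axis=0)          # sum_i g_i
    sum_sq_norms = np.sum(g * g)      # sum_i ||g_i||^2
    return float(grad_sum @ grad_sum / sum_sq_norms)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, d = 64, 1000

    aligned = np.tile(rng.normal(size=d), (m, 1))   # identical gradients
    random = rng.normal(size=(m, d))                # nearly orthogonal in high dimension

    print(coherence_estimate(aligned))  # ~64 (= m): every example benefits from any step
    print(coherence_estimate(random))   # ~1: each example mostly only helps itself
```

In practice the per-example gradients would be collected from the model being trained (one backward pass per example, or a vectorized per-example-gradient routine) and the estimate tracked over the course of training, as the abstract describes.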
