SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning
Backward locking and update locking are well-known sources of inefficiency in backpropagation that prevent layers from being updated concurrently. Several recent works have suggested using local error signals to train network blocks asynchronously to overcome these limitations. However, they often require numerous iterations of trial and error to find the best configuration for local training, including how to decouple network blocks and which auxiliary networks to use for each block. In this work, we propose a differentiable search algorithm named SEDONA to automate this process. Experimental results show that our algorithm consistently discovers transferable decoupled architectures for VGG and ResNet variants that significantly outperform networks trained with end-to-end backpropagation and other state-of-the-art greedy-learning methods on CIFAR-10, Tiny-ImageNet, and ImageNet. Thanks to the improved parallelism of local training, we also report up to a 2.02× speedup over backpropagation in total training time.
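To make the greedy block-wise setup concrete, below is a minimal sketch of local training with auxiliary error signals, assuming a PyTorch-style implementation: each block is paired with a small auxiliary classifier that provides its local loss, and the activation passed to the next block is detached so no gradient crosses block boundaries. The block and auxiliary-network choices here are illustrative placeholders, not the configurations SEDONA actually searches for.

```python
import torch
import torch.nn as nn

class LocalBlock(nn.Module):
    """A network block with an auxiliary head that supplies a local error signal."""
    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        # Auxiliary network (placeholder): pooled linear classifier on the block output.
        self.aux = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(out_ch, num_classes),
        )

    def forward(self, x):
        h = self.body(x)
        return h, self.aux(h)

def train_step(blocks, optimizers, criterion, x, y):
    """One decoupled step: every block updates from its own local loss;
    detach() cuts the graph so no gradient flows to earlier blocks."""
    h = x
    for block, opt in zip(blocks, optimizers):
        h, logits = block(h)
        loss = criterion(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        h = h.detach()  # remove backward dependency between blocks
    return loss.item()

if __name__ == "__main__":
    num_classes = 10
    blocks = nn.ModuleList([
        LocalBlock(3, 32, num_classes),
        LocalBlock(32, 64, num_classes),
        LocalBlock(64, 128, num_classes),
    ])
    optimizers = [torch.optim.SGD(b.parameters(), lr=0.1) for b in blocks]
    criterion = nn.CrossEntropyLoss()
    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, num_classes, (8,))
    print(train_step(blocks, optimizers, criterion, x, y))
```

Because each block depends only on its own local loss once its input is available, blocks can in principle be updated in parallel across devices, which is the source of the training-time speedup reported above.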