
Recycling sub-optimal Hyperparameter Optimization models to generate efficient Ensemble Deep Learning

Ensemble Deep Learning improves accuracy over a single model by combining the predictions of multiple models. It has established itself as the core strategy for tackling the most difficult problems, such as winning Kaggle challenges. Because there is no consensus on how to design a successful deep learning ensemble, we introduce Hyperband-Dijkstra, a new workflow that automatically explores neural network designs with Hyperband and efficiently combines them with Dijkstra's algorithm. This workflow has the same training cost as a standard Hyperband run, except that sub-optimal solutions are stored and become candidates in the ensemble selection step (recycling). Next, to predict on new data, the user gives Dijkstra the maximum number of models wanted in the ensemble, controlling the trade-off between accuracy and inference time. Hyperband is a very efficient algorithm that allocates exponentially more resources to the most promising configurations. Its pure-exploration nature also lets it propose diverse models, which allows the Dijkstra step to achieve a strong reduction of variance and bias through a smart combination of these diverse models. The exploding number of possible combinations generated by Hyperband increases the probability that Dijkstra finds an accurate combination that fits the dataset and generalizes to new data. The two experiments, on CIFAR100 and on our unbalanced microfossil dataset, show that our new workflow generates an ensemble far more accurate than any ensemble of ResNet models from ResNet18 to ResNet152.
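To make the recycling and selection steps concrete, the sketch below is a minimal Python illustration, not the authors' implementation. It assumes that every model trained during the Hyperband run has been kept together with its validation-set class probabilities, and it replaces the paper's Dijkstra-based search over model combinations with a simple greedy selection that respects the user-chosen maximum ensemble size; the names `val_probs`, `y_val`, `max_models` and the `predict` interface are hypothetical.

```python
import numpy as np

def log_loss(y_true, probs, eps=1e-12):
    """Multi-class cross-entropy of averaged probabilities on held-out labels."""
    p = np.clip(probs[np.arange(len(y_true)), y_true], eps, 1.0)
    return -np.mean(np.log(p))

def select_ensemble(val_probs, y_val, max_models):
    """Pick up to `max_models` recycled models whose averaged validation
    predictions give the lowest loss (a greedy stand-in for the paper's
    Dijkstra-based combination search).

    val_probs : list of (n_val, n_classes) probability arrays, one per
                recycled Hyperband model (sub-optimal models included).
    y_val     : (n_val,) integer class labels of the validation set.
    Returns the indices of the selected models.
    """
    selected = []
    running_sum = None
    for _ in range(max_models):
        best_idx, best_loss = None, np.inf
        for i, probs in enumerate(val_probs):
            if i in selected:
                continue
            cand_sum = probs if running_sum is None else running_sum + probs
            loss = log_loss(y_val, cand_sum / (len(selected) + 1))
            if loss < best_loss:
                best_idx, best_loss = i, loss
        if best_idx is None:  # fewer candidates than max_models
            break
        selected.append(best_idx)
        running_sum = (val_probs[best_idx] if running_sum is None
                       else running_sum + val_probs[best_idx])
    return selected

def ensemble_predict(models, x):
    """Average the class probabilities of the selected members at inference
    time; each model is assumed to expose predict(x) returning probabilities."""
    return np.mean([m.predict(x) for m in models], axis=0)
```

With the selected indices, inference on new data reduces to averaging the members' predicted probabilities, which is where the user-specified maximum ensemble size trades accuracy against inference time.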

