

Uploaded: 2021-01-24 03:37:35. PDF file, 450.92 KB.

DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition

In this work we tackle the challenging problem of anime character recognition. Anime refers to animation produced within Japan and to work derived from or inspired by it. For this purpose we present DAF:re (DanbooruAnimeFaces:revamped), a large-scale, crowd-sourced, long-tailed dataset with almost 500K images spread across more than 3000 classes. Additionally, we conduct experiments on DAF:re and similar datasets using a variety of classification models, including CNN-based ResNets and the self-attention-based Vision Transformer (ViT). Our results give new insights into the generalization and transfer learning properties of ViT models on datasets from domains substantially different from those used for upstream pre-training, including the influence of batch and image size on their training. Finally, we share our dataset, source code, pre-trained checkpoints, and results as Animesion, the first end-to-end framework for large-scale anime character recognition: https://github.com/arkel23/animesion

