A U-Net Based Discriminator for Generative Adversarial Networks
Among the major remaining challenges for generative adversarial networks (GANs) is the capacity to synthesize globally and locally coherent images with object shapes and textures indistinguishable from real images. To target this issue we propose an alternative U-Net based discriminator architecture, borrowing insights from the segmentation literature. The proposed U-Net based architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images, because it also provides global image feedback. Empowered by the per-pixel response of the discriminator, we further propose a per-pixel consistency regularization technique based on the CutMix data augmentation, encouraging the U-Net discriminator to focus more on semantic and structural changes between real and fake images. This improves the U-Net discriminator training, further enhancing the quality of generated samples. The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics, enabling the generator to synthesize images with varying structure, appearance and levels of detail while maintaining global and local realism. Compared to the BigGAN baseline, we achieve an average improvement of 2.7 FID points across FFHQ, CelebA, and the newly introduced COCO-Animals dataset.
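The abstract does not state the exact form of the consistency term, so the following is only a minimal sketch under an assumption: that the regularizer penalizes the squared difference between the per-pixel discriminator output on a CutMix of a real and a fake image, and the CutMix of the per-pixel outputs on the two images separately. All function names are hypothetical, and the two toy "discriminators" are stand-ins for illustration, not the U-Net described above.

```python
import random


def cutmix_mask(height, width, rng):
    """Binary mask: 0 inside a random box (take the fake pixel), 1 elsewhere (keep the real pixel)."""
    top = rng.randrange(height)
    left = rng.randrange(width)
    bottom = rng.randrange(top + 1, height + 1)
    right = rng.randrange(left + 1, width + 1)
    return [[0 if top <= r < bottom and left <= c < right else 1
             for c in range(width)] for r in range(height)]


def mix(a, b, mask):
    """Per-pixel CutMix of two equally sized 2-D maps: mask * a + (1 - mask) * b."""
    return [[m * x + (1 - m) * y for x, y, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(a, b, mask)]


def consistency_loss(d_of_mixed, d_real, d_fake, mask):
    """Assumed form: squared L2 distance between D(mix(real, fake)) and mix(D(real), D(fake))."""
    target = mix(d_real, d_fake, mask)
    return sum((p - t) ** 2
               for row_p, row_t in zip(d_of_mixed, target)
               for p, t in zip(row_p, row_t))


def pixelwise_d(img):
    """Toy stand-in: each per-pixel score depends only on that pixel."""
    return [[2.0 * v - 1.0 for v in row] for row in img]


def mean_shifted_d(img):
    """Toy stand-in: each per-pixel score also depends on the global image mean."""
    mean = sum(map(sum, img)) / (len(img) * len(img[0]))
    return [[v - mean for v in row] for row in img]


if __name__ == "__main__":
    rng = random.Random(0)
    real = [[1.0] * 8 for _ in range(8)]
    fake = [[0.0] * 8 for _ in range(8)]
    mask = cutmix_mask(8, 8, rng)
    mixed = mix(real, fake, mask)
    for name, d in (("pixelwise", pixelwise_d), ("mean_shifted", mean_shifted_d)):
        loss = consistency_loss(d(mixed), d(real), d(fake), mask)
        print(name, loss)
```

In this toy setting, a discriminator whose per-pixel decision depends only on local content is unaffected by the term, while one whose per-pixel scores shift with the global statistics of the mixed image is penalized. That is one reading of how such a regularizer could push the per-pixel output toward real/fake structure rather than toward the augmentation itself.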