Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization
Learning discriminative and invariant feature representations is the key to visual image categorization. In this article, we propose a novel invariant deep compressible covariance pooling (IDCCP) method to address nuisance variations in aerial scene categorization. We transform the input image according to a finite transformation group that consists of multiple confounding orthogonal matrices, such as the D4 group. Then, we adopt a Siamese-style network to transfer the group structure to the representation space, where we can derive a trivial representation that is invariant under the group action. A linear classifier trained on the trivial representation inherits this invariance. To further improve the discriminative power of the representation, we extend it to the tensor space while imposing orthogonal constraints on the transformation matrix to effectively reduce the feature dimension. We conduct extensive experiments on publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods. In particular, with a ResNet architecture, our IDCCP model can reduce the dimension of the tensor representation by about 98% with almost no loss in accuracy (i.e., <0.5%).
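The two ideas in the abstract, averaging over a group orbit to obtain an invariant representation and compressing features with an orthonormal matrix before second-order pooling, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the linear map standing in for the shared network branch, and the sizes (d = 64, k = 8) are all assumptions chosen for illustration, and simple orbit averaging is used here as one way to obtain a D4-invariant vector.

```python
import numpy as np

def d4_orbit(img):
    """All 8 D4 transforms of a square image: 4 rotations, each with and without a flip."""
    orbit = []
    for k in range(4):
        rotated = np.rot90(img, k)
        orbit.append(rotated)
        orbit.append(np.fliplr(rotated))
    return orbit

def toy_features(img, weights):
    """Hypothetical stand-in for a shared network branch: a fixed linear map, not invariant by itself."""
    return weights @ img.ravel()

def invariant_features(img, weights):
    """Average the branch outputs over the D4 orbit; the mean is unchanged by any D4 transform."""
    return np.mean([toy_features(t, weights) for t in d4_orbit(img)], axis=0)

def compressed_covariance(feature_map, proj):
    """Project d channels to k with an orthonormal matrix, then apply covariance pooling."""
    reduced = proj.T @ feature_map                           # shape (k, n_positions)
    reduced = reduced - reduced.mean(axis=1, keepdims=True)  # center each channel
    return reduced @ reduced.T / reduced.shape[1]            # shape (k, k)

rng = np.random.default_rng(0)
img = rng.normal(size=(6, 6))
weights = rng.normal(size=(5, 36))  # assumed toy weights, not a trained model

# The orbit average is the same for every transformed copy of the image.
base = invariant_features(img, weights)
invariance_holds = all(
    np.allclose(base, invariant_features(t, weights)) for t in d4_orbit(img)
)
# The raw branch output, in contrast, changes under a rotation.
raw_differs = not np.allclose(
    toy_features(img, weights), toy_features(np.rot90(img), weights)
)

# Orthonormal projection from d = 64 to k = 8 channels (illustrative sizes only).
d, k = 64, 8
proj, _ = np.linalg.qr(rng.normal(size=(d, k)))  # columns satisfy proj.T @ proj = I_k
cov = compressed_covariance(rng.normal(size=(d, 49)), proj)

print(invariance_holds, raw_differs, cov.shape)
```

Invariance holds in this sketch because transforming the input by a group element only permutes the members of its orbit, so the average is unaffected. With the illustrative sizes above, the second-order representation shrinks from d × d = 4096 entries to k × k = 64; these numbers are not taken from the paper.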