
Convolution, attention and structure embedding

Uploaded 2021-01-24 04:59:29 · PDF file · 352.40 KB · Popularity: 27

Deep neural networks are composed of layers of parametrised linear operations intertwined with non-linear activations. In basic models, such as the multi-layer perceptron, a linear layer operates on a simple input vector embedding of the instance being processed, and produces an output vector embedding by straight multiplication by a matrix parameter. In more complex models, the input and output are structured and their embeddings are higher-order tensors. The parameter of each linear operation must then be controlled so as not to explode with the complexity of the structures involved. This is essentially the role of convolution models, which exist in many flavours depending on the type of structure they deal with (grids, networks, time series, etc.). We present here a unified framework which aims at capturing the essence of these diverse models, allowing a systematic analysis of their properties and their mutual enrichment. We also show that attention models naturally fit in the same framework: attention is convolution in which the structure itself is adaptive and learnt, instead of being given a priori.
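The abstract's central claim, that an MLP layer, a convolution and an attention layer share the same algebraic form and differ only in how the structural mixing matrix is obtained, can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' formalism; all names, shapes and the specific 1-D neighbourhood are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
n, d = 5, 4                      # n: positions in the structure, d: embedding size
X = rng.normal(size=(n, d))      # input: one embedding vector per position
W = rng.normal(size=(d, d))      # shared feature parameter, as in a plain linear layer

# (1) MLP-style linear layer: each position is transformed independently.
Y_mlp = X @ W

# (2) Convolution-style layer: positions are mixed by a FIXED structural matrix
#     encoding the a-priori structure (here, a 1-D grid neighbourhood).
A_conv = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):  # each position mixes with its grid neighbours
        if 0 <= j < n:
            A_conv[i, j] = 1.0 / 3.0
Y_conv = A_conv @ X @ W          # structural mixing, then feature mixing

# (3) Attention-style layer: same algebraic form, but the structural matrix is
#     COMPUTED from the input (adaptive, learnt) rather than given a priori.
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, d))
scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
A_attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax
Y_attn = A_attn @ X @ W

print(Y_mlp.shape, Y_conv.shape, Y_attn.shape)  # all (n, d)
```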
