
Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing

Convolutional Neural Networks with 3D kernels (3D CNNs) currently achieve state-of-the-art results in video recognition tasks due to their supremacy in extracting spatiotemporal features within video frames. Many successful 3D CNN architectures have successively surpassed the state of the art. However, nearly all of them are designed to operate offline, which creates several serious handicaps during online operation. Firstly, conventional 3D CNNs are not dynamic, since their output features represent the complete input clip rather than the most recent frame in the clip. Secondly, they do not preserve temporal resolution due to their inherent temporal downsampling. Lastly, 3D CNNs are constrained to a fixed temporal input size, limiting their flexibility. To address these drawbacks, we propose dissected 3D CNNs, where the intermediate volumes of the network are dissected and propagated over the depth (time) dimension for future calculations, substantially reducing the number of computations at online operation. For action classification, the dissected version of ResNet models performs 74-90% fewer computations at online operation while achieving ~5% better classification accuracy on the Kinetics-600 dataset than conventional 3D ResNet models. Moreover, the advantages of dissected 3D CNNs are demonstrated by deploying our approach onto several vision tasks, where it consistently improves performance.
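To make the core idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' code) of how an intermediate volume can be cached and propagated along the time dimension: the layer receives one frame per step, keeps a FIFO cache of the last (kernel_t - 1) feature slices as temporal context, and convolves only over the newest frame, so computation per step is roughly that of a single frame and temporal resolution is preserved. All class and variable names here are illustrative assumptions.

import torch
import torch.nn as nn


class DissectedConv3d(nn.Module):
    """Sketch of a 3D convolution operated frame-by-frame.

    Instead of receiving a whole clip, the layer is fed one frame at a
    time. The last (kernel_t - 1) intermediate feature slices are cached
    and reused as temporal context (a temporal skip over time), so only
    the newest frame's features are computed at each online step.
    """

    def __init__(self, in_ch, out_ch, kernel_t=3, kernel_s=3):
        super().__init__()
        self.kernel_t = kernel_t
        pad_s = kernel_s // 2
        # No temporal padding: temporal context comes from the cache,
        # so the stream's temporal resolution is not downsampled.
        self.conv = nn.Conv3d(in_ch, out_ch,
                              (kernel_t, kernel_s, kernel_s),
                              padding=(0, pad_s, pad_s))
        self.cache = None  # holds (N, C, kernel_t - 1, H, W)

    def forward(self, frame):
        # frame: (N, C, H, W) -> add a singleton time dimension
        x = frame.unsqueeze(2)
        if self.cache is None:
            # Cold start: replicate the first frame as temporal context.
            self.cache = x.repeat(1, 1, self.kernel_t - 1, 1, 1)
        clip = torch.cat([self.cache, x], dim=2)   # (N, C, kernel_t, H, W)
        # Slide the cache forward by one frame for the next call.
        self.cache = clip[:, :, 1:].detach()
        return self.conv(clip).squeeze(2)          # (N, out_ch, H, W)


# Usage: feed frames one at a time instead of fixed-size clips.
layer = DissectedConv3d(in_ch=3, out_ch=8)
for t in range(5):
    feat = layer(torch.randn(1, 3, 32, 32))        # one frame per step
    print(feat.shape)                              # torch.Size([1, 8, 32, 32])

Because the cached slices are reused rather than recomputed for every new frame, the per-step cost stays close to that of a single frame, which is the intuition behind the reported reduction in online computations; the actual architecture in the paper applies this dissection throughout the network's intermediate volumes.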

