Sparse GPU Kernels for Deep Learning
Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because these applications have relatively moderate levels of sparsity that are not sufficient for existing sparse kernels to outperform their dense counterparts. In this work, we study sparse matrices from deep learning applications and identify favorable properties that can be exploited to accelerate computation. Based on these insights, we develop high-performance GPU kernels for two sparse matrix operations widely applicable in neural networks: sparse matrix-dense matrix multiplication and sampled dense-dense matrix multiplication. Our kernels reach 27% of single-precision peak on Nvidia V100 GPUs. Using our kernels, we demonstrate sparse Transformer and MobileNet models that achieve 1.2-2.1x speedups and up to 12.8x memory savings without sacrificing accuracy.
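For readers unfamiliar with the two operations named above, the following is a minimal NumPy/SciPy sketch of their semantics. It is a CPU reference illustration only, not the paper's GPU kernels; all shapes, densities, and variable names (m, k, n, A, B, X, Y, S) are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Illustrative shapes and density; pruned networks typically exhibit
# the moderate sparsity levels the paper targets.
m, k, n = 64, 32, 48
density = 0.2

# SpMM: sparse matrix-dense matrix multiplication, C = A @ B, where A
# is sparse (e.g., pruned weights) and B is dense (e.g., activations).
A = sparse_random(m, k, density=density, format="csr", dtype=np.float32)
B = np.random.rand(k, n).astype(np.float32)
C = A @ B  # SciPy dispatches to a CSR-based SpMM

# SDDMM: sampled dense-dense matrix multiplication. Only the entries of
# the dense product X @ Y at the nonzero positions of a sparse sampling
# matrix S are computed, i.e., D = (X @ Y) * mask(S).
S = sparse_random(m, n, density=density, format="csr", dtype=np.float32)
X = np.random.rand(m, k).astype(np.float32)
Y = np.random.rand(k, n).astype(np.float32)
rows, cols = S.nonzero()
# Compute only the sampled dot products, one per nonzero of S, as an
# SDDMM kernel would; a dense product followed by masking is wasteful.
vals = np.einsum("ij,ij->i", X[rows], Y[:, cols].T)
```

In a sparse Transformer, for example, SDDMM computes masked attention scores and SpMM applies the resulting sparse attention to the values, which is why efficient GPU implementations of these two primitives translate directly into end-to-end model speedups.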