tensorflow 2.0 custom layers on gpu


Question

Will completely custom layers in TensorFlow automatically run on the GPU? I noticed in this document (https://www.tensorflow.org/guide/keras/rnn#rnn_layers_and_rnn_cells) that the RNN wrappers won't use cuDNN. Does that mean they won't run on the GPU?

Answer

Your custom layers will still use the GPU, and you can confirm that as explained in this answer.
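One way to see it for yourself is a minimal sketch like the following: TensorFlow 2.x can log the device each op is placed on via tf.debugging.set_log_device_placement, and the MyDense layer below is a hypothetical custom layer used only for illustration.

```python
import tensorflow as tf

# Log the device every op is placed on, so you can watch the
# custom layer's ops land on the GPU (if one is available).
tf.debugging.set_log_device_placement(True)

class MyDense(tf.keras.layers.Layer):
    """A deliberately simple custom layer: y = x @ w + b."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform")
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros")

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

x = tf.random.normal((4, 8))
y = MyDense(3)(x)  # the log should show ops on /device:GPU:0
print(y.device)
```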

You are right, though, that the custom layers won't use cuDNN. Why does that matter? To quote NVIDIA:

cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers

In other words, using these optimised primitives improves training performance. A number of examples with detailed explanations is provided in the cuDNN: Efficient Primitives for Deep Learning paper. Take spatial convolution, for instance: a non-optimised implementation would use the "naive" approach, while cuDNN uses all sorts of tricks to reduce the number of operations and batch them appropriately. The GPU is still fast compared to a classical CPU; cuDNN just makes it faster. For more recent, independent benchmarks, check out e.g. this article.
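To make the RNN point from the question concrete, here is a small sketch based on the Keras RNN guide linked above: the built-in tf.keras.layers.LSTM dispatches to the fused cuDNN kernel when it runs on a GPU with its default arguments, whereas wrapping an LSTMCell in the generic RNN layer always uses the generic implementation, which still executes on the GPU but without the fused primitive.

```python
import tensorflow as tf

# Built-in layer: with default arguments on a GPU, Keras uses
# the fused cuDNN LSTM kernel under the hood.
fast_lstm = tf.keras.layers.LSTM(64)

# Generic wrapper: runs the cell step by step with plain TF ops,
# still on the GPU, just without the cuDNN primitive.
generic_lstm = tf.keras.layers.RNN(tf.keras.layers.LSTMCell(64))

x = tf.random.normal((32, 10, 8))  # (batch, timesteps, features)
print(fast_lstm(x).shape)     # (32, 64)
print(generic_lstm(x).shape)  # (32, 64)
```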

Still, if TensorFlow runs in GPU mode, the complete computational graph will be executed on the GPU (to my knowledge there isn't even a simple way to take a portion of the graph, i.e. an intermediate layer, and put it on the CPU).
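As a quick sanity check (a sketch, not part of the original answer): when a GPU is visible, TF 2.x places newly created tensors and ops on it by default, which you can verify from a tensor's .device attribute.

```python
import tensorflow as tf

# List the GPUs TF can see; an empty list means CPU-only.
print(tf.config.list_physical_devices("GPU"))

# With a GPU visible, ops are placed on it by default.
x = tf.random.normal((2, 2))
print(x.device)  # e.g. ".../device:GPU:0" on a GPU machine
```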

