Why is my GPU slower than CPU when training LSTM/RNN models?
Question
My machine has the following spec:
CPU: Xeon E5-1620 v4
GPU: Titan X (Pascal)
Ubuntu 16.04
Nvidia driver 375.26
CUDA toolkit 8.0
cuDNN 5.1
I've benchmarked the following Keras examples with TensorFlow as the backend:
SCRIPT NAME                  GPU      CPU
stateful_lstm.py             5 sec    5 sec
babi_rnn.py                  10 sec   12 sec
imdb_bidirectional_lstm.py   240 sec  116 sec
imdb_lstm.py                 113 sec  106 sec
My GPU clearly outperforms my CPU on non-LSTM models:
SCRIPT NAME      GPU      CPU
cifar10_cnn.py   12 sec   123 sec
imdb_cnn.py      5 sec    119 sec
mnist_cnn.py     3 sec    47 sec
Has anyone else experienced this?
Answer
If you use Keras, use CuDNNLSTM in place of LSTM, or CuDNNGRU in place of GRU. In my case (2× Tesla M60) I'm seeing a 10x performance boost. By the way, I'm using a batch size of 128, as suggested by @Alexey Golyshev.
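The speedup comes from cuDNN's fused recurrent kernels, which replace the many small per-timestep ops of the generic LSTM implementation. Below is a minimal sketch of the swap, assuming Keras 2.x with the TensorFlow backend; the model and hyperparameters are illustrative (loosely following the imdb_lstm.py example), not taken from the benchmarks above.

from keras.models import Sequential
from keras.layers import Embedding, Dense, CuDNNLSTM

# Illustrative hyperparameters (assumed, not from the benchmarks above)
max_features = 20000   # vocabulary size
maxlen = 80            # sequence length
batch_size = 128       # as suggested by @Alexey Golyshev

model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
# Drop-in replacement for LSTM(128); the whole recurrent loop runs
# inside NVIDIA's cuDNN kernels (GPU only)
model.add(CuDNNLSTM(128))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# model.fit(x_train, y_train, batch_size=batch_size, epochs=3)

One caveat: CuDNNLSTM runs only on GPU and hard-codes the standard activations, with no dropout or recurrent_dropout arguments, so it is not an exact drop-in when your LSTM relies on those options.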