5-layer DNN in Keras trains slower using GPU

Problem description

I've written a 5-layer dense network in Keras 1.2 using tensorflow-gpu as the backend, and trained it on my MacBook Pro (CPU) and on a P2.xlarge instance in AWS (K80, CUDA enabled). Surprisingly, my MacBook Pro trains the model faster than the P2 instance. I've checked that the model is trained on the GPU on the P2, so I wonder: why does it run slower?

This is the network:

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import metrics

model = Sequential()
model.add(Dense(250, input_dim=input_dim, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(130, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(50, init='normal', activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(10, init='normal', activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(1, init='normal'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=[metrics.mae])
model.fit(x=X, y=Y, batch_size=batch, nb_epoch=epochs, shuffle=True,
          validation_data=(X_test, Y_test), verbose=2)

Thanks,

Alex.

Answer

I ran into a similar problem with a small network, and discovered that the wall-clock time was dominated by CPU computation and by data transfer between the CPU and the GPU; in particular, the transfer time was larger than the gain from doing the computations on the GPU instead of the CPU.

Without data to test on, my assumption is that, similarly, your network is too small to exploit the GPU's real strength, and that the reason you're seeing longer training times on the GPU is that the network spends more time transferring data between the GPU and CPU than it gains from doing the computations on the GPU.
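
One way to check where the time goes (a minimal sketch, assuming the Keras 1.x / TensorFlow 1.x APIs from the question, with X, Y and batch taken from the snippet above) is to enable device placement logging and time a single epoch on each machine:

import time
import tensorflow as tf
from keras import backend as K

# Print the device each op is placed on (TF 1.x API) to confirm GPU use.
K.set_session(tf.Session(config=tf.ConfigProto(log_device_placement=True)))

# Time one epoch and compare the number between the MacBook and the P2.
start = time.time()
model.fit(x=X, y=Y, batch_size=batch, nb_epoch=1, verbose=0)
print('one epoch took %.2fs' % (time.time() - start))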

Have you tried a noticeably larger network?
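
For example (a hypothetical sketch in the same Keras 1.2 style, not from the original thread), widening each layer by roughly 10x multiplies the arithmetic per batch while the data transferred per batch stays the same, so the GPU should pull ahead if transfer overhead is really the bottleneck:

from keras.models import Sequential
from keras.layers import Dense, Dropout

# Same shape of network, ~10x wider: much more compute per batch,
# identical input data volume per batch.
big = Sequential()
big.add(Dense(2500, input_dim=input_dim, init='normal', activation='relu'))
big.add(Dropout(0.2))
big.add(Dense(1300, init='normal', activation='relu'))
big.add(Dropout(0.2))
big.add(Dense(500, init='normal', activation='relu'))
big.add(Dense(1, init='normal'))
big.compile(loss='mean_squared_error', optimizer='adam')

Increasing batch_size is another cheap experiment: it amortizes the per-batch transfer overhead over more samples.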
