Keras (Tensorflow backend) slower on GPU than on CPU when training certain networks


Problem description

I am having some difficulty understanding exactly why GPU and CPU speeds are similar with small networks (the CPU is sometimes faster), while the GPU is faster with larger networks. The code at the bottom of the question runs in 103.7s on an i7-6700k, but when using tensorflow-gpu, the code runs in 29.5 seconds.

However, when I train a network that has 100 hidden neurons, instead of 1000 like in the example below, I get ~20 seconds when using the GPU, and ~15 seconds when using the CPU.

I read in another Stack Overflow answer that CPU->GPU transfers take a long time; I'm assuming this refers to loading the data examples onto the GPU.

Can someone explain why this occurs, and possibly suggest some change to the code that I can make to maximize speed?

import numpy as np
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.utils import np_utils
from keras.layers.core import Dense, Activation, Dropout
from sklearn.preprocessing import normalize

## Importing the MNIST dataset using Keras
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape for vector input
N, x, y = X_train.shape
X_train = normalize(np.reshape(X_train, (N, x * y)))

N, x, y = X_test.shape
X_test = normalize(np.reshape(X_test, (N, x * y)))

# one-hot encoding
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)

model = Sequential()
model.add(Dense(output_dim=750, input_dim=784))  # output_dim is the Keras 1 name for units
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Dense(150))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Dense(50))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Dense(50))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='Nadam', metrics=['accuracy'])

fit = model.fit(X_train, y_train, batch_size=128, nb_epoch=10, verbose=0)  # nb_epoch is the Keras 1 name for epochs

## Printing the accuracy of our model, according to the loss function specified in model.compile above
score = model.evaluate(X_test, y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])

Answer

In the case of tiny networks, batch loading may be the culprit here.

Keras loads each minibatch from RAM to the GPU at the start of each iteration, creating a bottleneck for tiny networks (where the forward/backward computation is very quick).
You can try using model.fit_generator instead of plain fit, so that the CPU thread that loads minibatches works in parallel; a sketch follows below.
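
Here is a minimal sketch of that approach (the minibatch_generator helper is hypothetical, and the argument names assume Keras 2's fit_generator):

import numpy as np

def minibatch_generator(X, y, batch_size=128):
    # Keras expects data generators to loop forever; training stops after steps_per_epoch.
    n = X.shape[0]
    while True:
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            yield X[batch], y[batch]

# A background CPU thread (workers=1 by default) prepares the next batch
# from the generator while the GPU trains on the current one.
fit = model.fit_generator(minibatch_generator(X_train, y_train, 128),
                          steps_per_epoch=X_train.shape[0] // 128,
                          epochs=10,
                          verbose=0)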

Unfortunately, there is no way I am aware of to preload the whole dataset onto the GPU for Keras (see my issue)

If you're using the Tensorflow backend, you can use the Google Timeline profiling tool to see what causes the slowdowns. For reference, see this issue; a sketch of capturing a trace follows below.
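
For example, on a TF 1.x backend you can capture a Chrome trace of the training steps and open it at chrome://tracing. This is a minimal sketch, assuming Keras 2's compile() forwards extra keyword arguments to session.run() on the Tensorflow backend, not a definitive recipe:

import tensorflow as tf
from tensorflow.python.client import timeline

# Ask TF to record full timing stats for each session.run call.
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()

# On the TF backend, Keras passes these kwargs through to session.run().
model.compile(loss='categorical_crossentropy', optimizer='Nadam',
              metrics=['accuracy'],
              options=run_options, run_metadata=run_metadata)
model.fit(X_train, y_train, batch_size=128, epochs=1, verbose=0)

# Dump the last recorded step as a Chrome trace; open it at chrome://tracing.
trace = timeline.Timeline(step_stats=run_metadata.step_stats)
with open('timeline.json', 'w') as f:
    f.write(trace.generate_chrome_trace_format())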
