tensorflow gpu is only running on CPU
Problem Description
I installed Anaconda-Navigator on Windows 10 together with all the necessary Nvidia/CUDA packages, created a new environment called tensorflow-gpu-env, updated the PATH information, etc. When I run a model (built with tensorflow.keras), I see that CPU utilization increases significantly, GPU utilization stays at 0%, and the model just does not train.
I ran a couple of tests to see how things look:
print(tf.test.is_built_with_cuda())
True
The above output ('True') looks correct.
Another test:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
Output:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 1634313269296444741
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 1478485606
locality {
bus_id: 1
links {
}
}
incarnation: 16493618810057409699
physical_device_desc: "device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0"
]
So far so good... Later in my code, I start the training with the following code:
history = merged_model.fit_generator(generator=train_generator,
                                     epochs=60,
                                     verbose=2,
                                     callbacks=[reduce_lr_on_plateau],
                                     validation_data=val_generator,
                                     use_multiprocessing=True,
                                     max_queue_size=50,
                                     workers=3)
I also tried to run the training as follows:
with tf.device('/gpu:0'):
    history = merged_model.fit_generator(generator=train_generator,
                                         epochs=60,
                                         verbose=2,
                                         callbacks=[reduce_lr_on_plateau],
                                         validation_data=val_generator,
                                         use_multiprocessing=True,
                                         max_queue_size=50,
                                         workers=3)
No matter how I start the training, it never actually starts; I keep seeing high CPU utilization with 0% GPU utilization.
Why is my tensorflow-gpu installation only using the CPU? I have spent hours on this with literally no progress.
ADDENDUM
When I run conda list on the console, I see the following regarding tensorflow:
tensorflow-base 1.11.0 gpu_py36h6e53903_0
tensorflow-gpu 1.11.0 <pip>
What is this tensorflow-base? Can it cause a problem? Before installing tensorflow-gpu, I made sure that I uninstalled both tensorflow and tensorflow-gpu using both conda and pip, and then installed tensorflow-gpu with pip. I am not sure whether this tensorflow-base came with my tensorflow-gpu installation.
ADDENDUM 2
It looks like tensorflow-base was part of the conda installation, because I could uninstall it with conda uninstall tensorflow-base. I still have the tensorflow-gpu installation in place, but I now cannot import tensorflow anymore; it says "No module named tensorflow". It looks like my conda environment does not see my tensorflow-gpu installation. I am quite confused at the moment.
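For reference, one possible way to recover from this state is to recreate the environment from scratch and install a single TensorFlow build, so conda and pip packages cannot shadow each other. This is an untested sketch; the Python and TensorFlow versions are assumptions based on the conda listing above.

```shell
# Untested recovery sketch: remove the broken env and rebuild it
# with exactly one TensorFlow package (versions are assumptions).
conda deactivate
conda env remove -n tensorflow-gpu-env
conda create -n tensorflow-gpu-env python=3.6
conda activate tensorflow-gpu-env
pip install tensorflow-gpu==1.11.0
# Sanity check that the GPU build imports and was built with CUDA:
python -c "import tensorflow as tf; print(tf.test.is_built_with_cuda())"
```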
Recommended Answer
@Smokrow, thank you for your answers above. It appears that Keras has problems with multiprocessing on Windows.
history = merged_model.fit_generator(generator=train_generator,
                                     epochs=60,
                                     verbose=2,
                                     callbacks=[reduce_lr_on_plateau],
                                     validation_data=val_generator,
                                     use_multiprocessing=True,
                                     max_queue_size=50,
                                     workers=3)
The piece of code above causes Keras to hang, and literally no progress is seen. If you are running your code on Windows, use_multiprocessing needs to be set to False! Otherwise it does not work. Interestingly, workers can still be set to a number greater than one, and it still gives performance benefits. I am having difficulty understanding what is really happening in the background, but it does improve performance. So the following piece of code made it work.
history = merged_model.fit_generator(generator=train_generator,
                                     epochs=60,
                                     verbose=2,
                                     callbacks=[reduce_lr_on_plateau],
                                     validation_data=val_generator,
                                     use_multiprocessing=False,  # CHANGED
                                     max_queue_size=50,
                                     workers=3)
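My rough understanding of why use_multiprocessing=True misbehaves on Windows (a minimal stdlib sketch, not Keras internals; the worker function is hypothetical): Windows has no fork(), so Python's multiprocessing always uses the "spawn" start method, which re-imports the main module in every child process and must pickle everything sent to workers. Linux defaults to "fork", which is why the same Keras code can behave differently across platforms.

```python
import multiprocessing as mp

def square(x):
    # Hypothetical worker: under "spawn" it must be defined at module
    # top level, because each child process re-imports this file.
    return x * x

if __name__ == "__main__":
    # Force the "spawn" start method that Windows always uses,
    # so this sketch behaves the same on any platform.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=3) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

This is also consistent with workers=3 still helping when use_multiprocessing=False: Keras then feeds the input queue from threads instead of processes, so the generator stays in one process and the spawn/pickling issue never arises.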