Keras sees my GPU but doesn't use it when training a neural network

Problem description

Keras/TensorFlow does not use my GPU.

To try to get my GPU working with TensorFlow, I installed tensorflow-gpu via pip (I am using Anaconda on Windows).

I have an NVIDIA GeForce GTX 1080 Ti.

print(tf.test.is_gpu_available())

True

print(tf.config.experimental.list_physical_devices())

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), 
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

I tried:

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

but it didn't help.

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))
print(sess)

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1

<tensorflow.python.client.session.Session object at 0x000001A2A3BBACF8>

The only warning from TensorFlow:

W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows 

Full log:

2019-10-18 20:06:26.094049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-10-18 20:06:35.078225: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-10-18 20:06:35.090832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-10-18 20:06:35.180744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:35.185505: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:35.189328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:06:35.898592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:06:35.901683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:06:35.904235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:06:35.906687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:06:38.694481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:38.700482: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:38.704020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[I 20:06:47.324 NotebookApp] Saving file at /Untitled.ipynb
2019-10-18 20:07:22.227110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:22.246012: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:22.261643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:22.272150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:22.275457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:22.277980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:22.316260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2019-10-18 20:07:32.986802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:32.990509: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:32.993763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:32.995570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:32.997920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:32.999435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:33.001380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:07:36.048204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-10-18 20:07:37.971703: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-10-18 20:07:38.576861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll

I also tried reinstalling tensorflow-gpu with pip.

Why do I think the GPU isn't working? Because my Python kernel uses 99% CPU and 99% RAM, while GPU usage is sometimes ~7% but 0% most of the time.
I use a custom data generator, but for now it only selects batches and resizes them (skimage.io.resize). One epoch takes ~44 s. Training also behaves strangely: it freezes at random points roughly every 10 samples, and it freezes hardest on the last sample (37/38), for ~10-15 s.

Here is how I create the generator:

train_gen = DataGenerator(x=x_train,
                          y=y_train,
                          batch_size=128,
                          target_shape=(100, 100, 3),
                          sample_std=False,
                          feature_std=False,
                          proj_parameters=None,
                          blur_parameters=None,
                          nois_parameters=None,
                          flip_parameters=None,
                          gamm_parameters=None)

The validation generator is set up the same way.

So it is the generator that causes the problem, but how can I fix it?
I used only skimage and numpy operations.

Recommended answer

The logs show that the GPU does get used. You are almost certainly running into an IO bottleneck: your GPU processes whatever the CPU throws at it much faster than the CPU can load and preprocess the data. This is very common in deep learning, and there are ways to address it.
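One way to check the bottleneck claim is to time the generator on its own, with no model involved. The sketch below is only an illustration: `time_batches` and `slow_batches` are hypothetical names, and `slow_batches` is a stand-in for the real generator. It assumes the generator from the question can be iterated batch by batch.

```python
import time


def time_batches(batch_iter, num_batches=10):
    """Return the average number of seconds spent waiting for one batch."""
    it = iter(batch_iter)
    start = time.perf_counter()
    count = 0
    for _ in range(num_batches):
        try:
            next(it)  # only fetch the batch; no training happens here
        except StopIteration:
            break
        count += 1
    elapsed = time.perf_counter() - start
    return elapsed / count if count else 0.0


def slow_batches(n, delay=0.01):
    """Hypothetical stand-in for a CPU-bound generator such as train_gen."""
    for i in range(n):
        time.sleep(delay)  # simulates loading and resizing one batch
        yield i


avg = time_batches(slow_batches(5), num_batches=5)
print(f"average wait per batch: {avg:.3f} s")
```

If the average wait per batch, multiplied by the 38 batches per epoch mentioned in the question, comes close to the ~44 s epoch time, the generator rather than the GPU is the limiting factor.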

We cannot provide much help without knowing more about your data pipeline (byte size of a batch, preprocessing steps, ...) and how the data is stored. One typical way to speed things up is to store the data in a binary format, like TFRecords, so that the CPU can load it faster. See the official documentation for this.
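As a rough illustration of the TFRecords suggestion, the sketch below writes a few images to a TFRecord file and reads them back with `tf.data.TFRecordDataset`. It assumes TensorFlow 2.x and uint8 images of a fixed shape. The random arrays, the feature names `image` and `label`, the file name, and the helper functions are placeholders, not taken from the question.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def serialize_example(image, label):
    """Pack one uint8 image and its integer label into a serialized tf.train.Example."""
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[int(label)])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()


def parse_example(record, shape=(100, 100, 3)):
    """Decode one serialized record back into an (image, label) pair."""
    spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(record, spec)
    image = tf.reshape(tf.io.decode_raw(parsed["image"], tf.uint8), shape)
    return image, parsed["label"]


# Placeholder data standing in for the real training set (assumed shape and dtype).
images = np.random.randint(0, 256, size=(4, 100, 100, 3), dtype=np.uint8)
labels = np.array([0, 1, 0, 1])

# Write every example once, up front.
path = os.path.join(tempfile.mkdtemp(), "train.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for image, label in zip(images, labels):
        writer.write(serialize_example(image, label))

# Read the file back as a dataset of (image, label) pairs.
dataset = tf.data.TFRecordDataset(path).map(parse_example)
```

Storing raw bytes keeps parsing simple but makes the file as large as the uncompressed images; whether that trade-off pays off depends on the dataset, which the question does not describe.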

I quickly went through your input pipeline. The issue is indeed very likely to be IO:

  • You should run the preprocessing steps on the GPU as well; plenty of the augmentation techniques you use are implemented in tf.image. If you can, consider using TensorFlow 2.0, because it includes Keras and has plenty of helpers as well.
  • Check out the tf.data.Dataset API: it has plenty of helpers to load the data in multiple threads, which can speed up the process by roughly the number of cores you have.
  • You should store your images as TFRecords. This is likely to speed up loading by an order of magnitude if your input images are smallish.
  • You could probably try larger batch sizes as well; I'm guessing your images are really small.
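The first two suggestions can be combined in a single input pipeline. The following is a minimal sketch, assuming TensorFlow 2.4 or later (earlier 2.x releases spell the constant `tf.data.experimental.AUTOTUNE`) and in-memory uint8 arrays. The placeholder arrays `x` and `y`, their shapes, and the `preprocess` function are assumptions; the flip is just one example of a tf.image augmentation, not a reproduction of the original generator.

```python
import numpy as np
import tensorflow as tf

# Placeholder arrays standing in for x_train / y_train (assumed shapes and dtypes).
x = np.random.randint(0, 256, size=(256, 64, 64, 3), dtype=np.uint8)
y = np.random.randint(0, 2, size=(256,))


def preprocess(image, label):
    """Resize and augment one image with TensorFlow ops instead of skimage/numpy."""
    image = tf.image.convert_image_dtype(image, tf.float32)  # uint8 -> float32 in [0, 1]
    image = tf.image.resize(image, (100, 100))
    image = tf.image.random_flip_left_right(image)
    return image, label


dataset = (
    tf.data.Dataset.from_tensor_slices((x, y))
    .shuffle(buffer_size=len(x))
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # preprocess examples in parallel
    .batch(128)
    .prefetch(tf.data.AUTOTUNE)  # prepare upcoming batches while the current one trains
)

# A compiled Keras model is assumed here; it could then consume the dataset directly:
# model.fit(dataset, epochs=...)
```

The `prefetch` step is what lets batch preparation overlap with training, so the GPU no longer has to wait for each batch to be built.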
