Tensorflow 2.0rc not detecting GPUs


Problem description

TF2 is currently not detecting GPUs. I migrated from TF1.14, where using

tf.keras.utils.multi_gpu_model(model=model, gpus=2)

now returns the error

ValueError: To call `multi_gpu_model` with `gpus=2`, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0', '/xla_cpu:0', '/xla_gpu:0', '/xla_gpu:1', '/xla_gpu:2', '/xla_gpu:3']. Try reducing `gpus`.

Running nvidia-smi returns the following:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:09:00.0 Off |                    0 |
| N/A   46C    P0    62W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:0A:00.0 Off |                    0 |
| N/A   36C    P0    71W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:86:00.0 Off |                    0 |
| N/A   38C    P0    58W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:87:00.0 Off |                    0 |
| N/A   31C    P0    82W / 149W |      0MiB / 11441MiB |     73%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Also, here is my TF version, which is built for CUDA:

2.0.0-rc0

Please let me know what I am doing wrong so I can fix it.

Recommended answer

I would suggest the following:

  1. First check your CUDA version. Make sure it is 10.0, the version TF 2.0 is built against.

  2. If it is 10.0, then check whether your TF package is the GPU build or not.

  3. Check whether TF can access the GPUs using the command:

import tensorflow as tf

value = tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
)
print('*** If TF can access GPU: ***\n\n', value)  # must print True if it can
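In TF 2.x you can also list the physical GPU devices directly. This is a minimal sketch, assuming TensorFlow 2.0 is installed; on a CPU-only machine it simply prints an empty list:

```python
import tensorflow as tf

# List the physical GPU devices TF can actually see. An empty list here,
# while nvidia-smi shows GPUs, points at a CUDA/driver or build mismatch
# (e.g. only xla_gpu devices registered, as in the error above).
gpus = tf.config.experimental.list_physical_devices('GPU')
print('GPUs visible to TF:', gpus)
```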

  4. I assume the first two points are already taken care of by you. If TF can also access your GPUs then, as you can see in your ValueError, it actually has the names of the GPUs. I cannot speak to the tf.keras.utils.multi_gpu_model() function because I have not used it in TF, but I would suggest you use with tf.device('/gpu:0'): and call or define your model inside that scope.
  5. If point 4 also doesn't work, then just add the following lines

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"  # 0,1,2,3 are the GPU IDs to expose

at the top of your file, and remove with tf.device('/gpu:0').
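For reference, the with tf.device('/gpu:0'): pattern from point 4 would look roughly like the sketch below. The two-layer model is a hypothetical placeholder, and soft device placement is enabled so the sketch also runs on a CPU-only machine:

```python
import tensorflow as tf

# Fall back to CPU automatically if '/gpu:0' is not available.
tf.config.set_soft_device_placement(True)

# Define the model inside the device scope (placeholder architecture).
with tf.device('/gpu:0'):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```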
