Why is TensorFlow not recognizing my GPU after conda install?

Question

I am new to deep learning and have spent the last two days trying, in vain, to install the tensorflow-gpu version on my PC. I avoided installing CUDA and cuDNN manually, since several online forums advise against it because of numerous compatibility issues. Since I was already using the conda distribution of Python, I went with conda install -c anaconda tensorflow-gpu, as described on the Anaconda package page: https://anaconda.org/anaconda/tensorflow-gpu .

However, even after installing the GPU version in a fresh virtual environment (to avoid potential conflicts with pip-installed libraries in the base environment), TensorFlow does not recognize my GPU.

Here are some of the snippets I ran (in Anaconda Prompt) that show it is not recognizing my GPU:

1.

>>> from tensorflow.python.client import device_lib
>>> print(device_lib.list_local_devices())
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7692219132769779763
]

As you can see, it completely ignores the GPU.
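For comparison, a shorter check uses the public tf.config API. This is a minimal sketch, assuming TensorFlow 2.1 or later; the helper name list_gpus is hypothetical and was not part of the original post. An empty list corresponds to the CPU-only output shown above.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow.

    Returns None when TensorFlow cannot be imported in this environment.
    list_gpus is a hypothetical helper name used only for illustration.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # tf.config.list_physical_devices is available from TensorFlow 2.1 onward.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("Visible GPUs:", gpus)
else:
    print("TensorFlow sees no GPU; only the CPU will be used")
```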

2.

>>> import tensorflow as tf
>>> tf.debugging.set_log_device_placement(True)
>>> a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
2020-12-13 10:11:30.902956: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
>>> c = tf.matmul(a, b)
>>> print(c)
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

Here, the output was supposed to show that the op ran on the GPU with a line such as Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0 (as described at https://www.tensorflow.org/guide/gpu), but nothing like that appears. I am also not sure what the message after the second line means.
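An alternative to reading the placement log is to inspect the device attribute of the result tensor directly. The sketch below assumes TensorFlow 2.x in eager mode; matmul_device is a hypothetical helper name. On a working GPU setup the returned string would be expected to end in GPU:0, and on a setup like the one above it would name the CPU instead.

```python
def matmul_device():
    """Run a small matmul and return the device string of the result.

    Returns None when TensorFlow cannot be imported.
    matmul_device is a hypothetical helper name used only for illustration.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)
    # In eager mode every tensor records the device that produced it.
    return c.device


device = matmul_device()
if device is None:
    print("TensorFlow is not installed in this environment")
else:
    print("MatMul result lives on:", device)
```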

I have also searched for solutions online, including here, but almost all of the issues relate to the manual installation method, which I have not tried yet since everyone recommended the conda approach instead.

I no longer use cmd, since the environment variables somehow got messed up after I uninstalled tensorflow-cpu from the base environment; after re-installing, it worked perfectly in Anaconda Prompt but not in cmd. This is a separate (and widespread) issue, but I mention it in case it plays a role here. I installed the GPU version in a fresh virtual environment to ensure a clean installation, and as far as I understand, path variables need to be set only for a manual installation of the CUDA and cuDNN libraries.

The card I use (which is CUDA-enabled):

C:\WINDOWS\system32>wmic path win32_VideoController get name
Name
NVIDIA GeForce 940MX
Intel(R) HD Graphics 620

TensorFlow and Python versions I am currently using:

>>> import tensorflow as tf
>>> tf.__version__
'2.3.0'

Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.

System information: Windows 10 Home, 64-bit operating system, x64-based processor.

Any help would be really appreciated. Thanks in advance.

Recommended answer

Currently, conda install tensorflow-gpu installs TensorFlow v2.3.0 and does NOT install the conda cudnn or cudatoolkit packages. Installing them manually (e.g. with conda install cudatoolkit=10.1) does not seem to fix the problem either.
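One rough way to see whether the CUDA runtime and cuDNN are reachable at all is to look for their shared libraries on the search path. This is only a sketch: the library names cudart64_101 and cudnn64_7 are assumptions based on the Windows DLLs that TensorFlow 2.3 is commonly reported to load, and find_cuda_libraries is a hypothetical helper. A "not found" result does not prove TensorFlow cannot load the library by other means.

```python
import ctypes.util

# Assumed Windows library names for CUDA 10.1 and cuDNN 7 (not from the
# original answer); adjust them for other CUDA or cuDNN versions.
ASSUMED_LIBRARIES = ("cudart64_101", "cudnn64_7")


def find_cuda_libraries(names=ASSUMED_LIBRARIES):
    """Map each library name to the path found on the search path, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


for name, path in find_cuda_libraries().items():
    print(name, "->", path if path else "not found on the search path")
```

Run this inside the activated conda environment, since activation is what puts the environment's library directory on the search path.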

One solution is to install an earlier version of TensorFlow, which does pull in cudnn and cudatoolkit, and then upgrade with pip:

conda install tensorflow-gpu=2.1
pip install tensorflow-gpu==2.3.1

(2.4.0 uses CUDA 11.0 and cuDNN 8.0; however, cuDNN 8.0 is not available from Anaconda as of 16/12/2020.)
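The version pairing that drives this workaround can be written down as a small lookup. The table below is a sketch, assumed from TensorFlow's published tested build configurations for GPU builds; only the 2.4 row is stated in this answer, and required_cuda is a hypothetical helper name.

```python
# TensorFlow release -> (CUDA version, cuDNN version).
# Assumed from TensorFlow's tested build configurations; verify against the
# official table before relying on it.
CUDA_REQUIREMENTS = {
    "2.1": ("10.1", "7.6"),
    "2.2": ("10.1", "7.6"),
    "2.3": ("10.1", "7.6"),
    "2.4": ("11.0", "8.0"),
}


def required_cuda(tf_version):
    """Return (cuda, cudnn) for a version string such as '2.3.1', or None if unknown."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return CUDA_REQUIREMENTS.get(major_minor)


print(required_cuda("2.3.1"))  # ('10.1', '7.6')
print(required_cuda("2.4.0"))  # ('11.0', '8.0')
```

Under these assumptions, 2.1 and 2.3 share the same CUDA and cuDNN versions, which would explain why the libraries pulled in by conda install tensorflow-gpu=2.1 still work after the pip upgrade to 2.3.1.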

Please also see @GZ0's answer, which links to a GitHub discussion with a one-line solution.
