Why is Tensorflow not recognizing my GPU after conda install?

Problem description

I am new to deep learning and have spent the last 2 days trying, in vain, to install the GPU version of tensorflow on my PC. I avoided installing CUDA and cuDNN manually, since several online forums advise against it because of numerous compatibility issues. Since I was already using the conda distribution of Python, I went for conda install -c anaconda tensorflow-gpu, as written on the Anaconda page here: https://anaconda.org/anaconda/tensorflow-gpu .

However, even after installing the GPU version in a fresh virtual environment (to avoid potential conflicts with pip-installed libraries in the base env), tensorflow doesn't seem to recognize my GPU at all.

Some of the code snippets I ran (in Anaconda Prompt) that show it isn't recognizing my GPU:

1.

>>> from tensorflow.python.client import device_lib
>>> print(device_lib.list_local_devices())
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7692219132769779763
]

As you can see, it completely ignores the GPU.

2.

>>> import tensorflow as tf
>>> tf.debugging.set_log_device_placement(True)
>>> a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
2020-12-13 10:11:30.902956: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
>>> c = tf.matmul(a, b)
>>> print(c)
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

Here, the output should have shown that the op ran on the GPU, with a line such as Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0 (as written here: https://www.tensorflow.org/guide/gpu), but nothing like that is present. I am also not sure what the message after the 2nd line means.

I have also searched for solutions online, including here, but almost all of the issues relate to the manual installation method, which I haven't tried yet since everyone recommended the conda approach instead.

I don't use cmd anymore: the environment variables somehow got messed up after I uninstalled tensorflow-cpu from the base env, and after re-installing, everything worked perfectly in Anaconda Prompt but not in cmd. This is a separate (and widespread) issue, but I mention it in case it plays a role here. I installed the GPU version in a fresh virtual environment to ensure a clean installation, and as far as I understand, path variables need to be set up only for a manual installation of the CUDA and cuDNN libraries.

The card I use (which is CUDA-enabled):

C:\WINDOWS\system32>wmic path win32_VideoController get name
Name
NVIDIA GeForce 940MX
Intel(R) HD Graphics 620

Tensorflow and Python versions I am currently using:

>>> import tensorflow as tf
>>> tf.__version__
'2.3.0'

Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.

System information: Windows 10 Home, 64-bit operating system, x64-based processor.

Any help would be really appreciated. Thanks in advance.

Solution

Update, August 2021: conda install may be working now. According to @ComputerScientist in the comments below, conda install tensorflow-gpu==2.4.1 will give cudatoolkit-10.1.243 and cudnn-7.6.5.

The following was written in Jan 2021 and is out of date.

Currently conda install tensorflow-gpu installs tensorflow v2.3.0 and does NOT install the conda cudnn or cudatoolkit packages. Installing them manually (e.g. with conda install cudatoolkit=10.1) does not seem to fix the problem either.
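To see whether an environment is affected, one option is to look for the two packages in the output of conda list --json. The sketch below is only an illustration: the helper names missing_gpu_packages and check_active_env are made up for this example, and it assumes that conda is on PATH and that each entry in the JSON output carries a "name" field.

```python
import json
import shutil
import subprocess

# Conda packages that the GPU build relies on, as named in the answer above.
REQUIRED = ("cudatoolkit", "cudnn")


def missing_gpu_packages(conda_list_json, required=REQUIRED):
    """Return the required package names absent from `conda list --json` output.

    Hypothetical helper: assumes the JSON is a list of objects with a "name" key.
    """
    installed = {pkg["name"] for pkg in json.loads(conda_list_json)}
    return [name for name in required if name not in installed]


def check_active_env():
    """Run `conda list --json` for the active environment.

    Returns None when no conda executable is found on PATH.
    """
    conda = shutil.which("conda")
    if conda is None:
        return None
    result = subprocess.run(
        [conda, "list", "--json"], capture_output=True, text=True, check=True
    )
    return missing_gpu_packages(result.stdout)
```

Calling check_active_env() from the activated environment returns a list of the missing package names; an empty list means both cudatoolkit and cudnn are installed there.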

A solution is to install an earlier version of tensorflow, which does install cudnn and cudatoolkit, and then upgrade with pip:

conda install tensorflow-gpu=2.1
pip install tensorflow-gpu==2.3.1

(2.4.0 uses cuda 11.0 and cudnn 8.0, however cudnn 8.0 is not in anaconda as of 16/12/2020)
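After the upgrade, a quick check along these lines can confirm whether tensorflow now sees the GPU. This is a minimal sketch, not part of the original answer: gpu_report and summarize are hypothetical helper names, the wording of the messages is only one possible reading of each case, and it uses the public tf.config.list_physical_devices and tf.test.is_built_with_cuda calls rather than the internal device_lib module.

```python
def gpu_report():
    """Collect basic facts about how TensorFlow sees the GPU (hypothetical helper)."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"tensorflow_installed": False}
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpu_names": [gpu.name for gpu in gpus],
    }


def summarize(report):
    """Turn a gpu_report() dict into a one-line message (interpretation is an assumption)."""
    if not report.get("tensorflow_installed"):
        return "TensorFlow is not installed in this environment."
    if not report.get("built_with_cuda"):
        return "This TensorFlow build is CPU-only (not built with CUDA)."
    if not report.get("gpu_names"):
        return "CUDA build, but no GPU is visible; cudatoolkit, cudnn or the driver may be missing."
    return "GPU visible: " + ", ".join(report["gpu_names"])
```

Running print(summarize(gpu_report())) inside the activated environment prints which of the four cases applies.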

Edit: please also see @GZ0's answer, which links to a github discussion with a one-line solution
