How to resolve the issue regarding Tensorflow and cuda compatibility?
Problem description
Error:
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
Command used to install the packages:
conda install -c anaconda keras-gpu
Installed:
- tensorflow 2.0.0
- cudatoolkit 10.0.130
- cudnn 7.6.5 cuda10.0_0
- keras-gpu 2.2.4
tf.test.is_gpu_available()
returns True
Answer
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
Generally this problem appears due to an incompatibility between the installed CUDA and cuDNN versions. Please refer to the tested build configurations for Linux GPU.
If tf.test.is_gpu_available() returns True, it means there is nothing wrong with the installation.
So in the next step, you can try managing GPU memory resources by allowing GPU memory growth.
It can be done by calling tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as needed for the runtime allocations: it starts out allocating very little memory, and as the program gets run and more GPU memory is needed, the GPU memory region allocated to the TensorFlow process is extended.
To turn on memory growth for a specific GPU, use the following code prior to allocating any tensors or executing any ops.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
Another method to enable memory growth is to set the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.
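On Linux/macOS shells, for example, the variable can be set before launching the training process (the script name below is a placeholder, not from the original answer):

```shell
# Ask TensorFlow's GPU allocator to grow memory on demand instead of
# reserving it all up front. Effective for TF processes started from
# this shell (Linux/macOS syntax).
export TF_FORCE_GPU_ALLOW_GROWTH=true
# python train.py   # then launch your own script (placeholder name)
```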
Another option is to configure a virtual GPU device with tf.config.experimental.set_virtual_device_configuration and set a hard limit on the total memory to allocate on the GPU.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
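Note that memory_limit is given in megabytes, so the 1024 above caps the virtual device at 1 GB. A tiny helper (hypothetical, not part of TensorFlow) makes the conversion explicit:

```python
def gb_to_mb(gigabytes):
    """Convert gigabytes to the megabyte units expected by memory_limit."""
    return int(gigabytes * 1024)

# e.g. to cap the first GPU at 2 GB instead of 1 GB:
# tf.config.experimental.VirtualDeviceConfiguration(memory_limit=gb_to_mb(2))
print(gb_to_mb(2))  # 2048
```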
For more details, please refer here.
One suggestion here: you can try installing tensorflow-gpu instead of keras-gpu.
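For example, via conda (pinning to 2.0.0 is an assumption based on the tensorflow version listed in the question, not a requirement):

```shell
# Replace keras-gpu with the tensorflow-gpu conda package; the version
# pin 2.0.0 matches the install shown above and is an assumption.
conda install -c anaconda tensorflow-gpu=2.0.0
```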