TensorFlow CUDA_ERROR_OUT_OF_MEMORY

Views: 44  Published: 2021/9/5 19:26:15  Tag: tensorflow


Problem description

I'm trying to build a large CNN in TensorFlow, and intend to run it on a multi-GPU system. I've adopted a "tower" system and split batches for both GPUs, while keeping the variables and other computations on the CPU. My system has 32GB of memory, but when I run my code I get the error:

E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to alloc 17179869184 bytes on host: CUDA_ERROR_OUT_OF_MEMORY
W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 17179869184
Killed
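
For scale, the size in the log can be converted to GiB. This is only a quick arithmetic sketch; it assumes the "32GB" in the question means 32 GiB of system RAM, which the question does not confirm.

```python
# Size reported in the log lines above, in bytes.
requested_bytes = 17179869184

GIB = 2 ** 30  # bytes per gibibyte

requested_gib = requested_bytes / GIB
print(f"requested pinned host memory: {requested_gib:.0f} GiB")

# Assumption: the system has 32 GiB of RAM (the question says "32GB").
system_gib = 32
print(f"share of system memory: {requested_gib / system_gib:.0%}")
```

So the single pinned-host allocation that fails is exactly 2**34 bytes, half of the assumed system memory.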

I've seen that the code works (though very, very slowly) if I hide the CUDA devices from TensorFlow, so that it doesn't use cudaMallocHost()...
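
The question does not say how the devices were hidden. One common way, shown here only as an assumed sketch, is to set the CUDA_VISIBLE_DEVICES environment variable to an empty value before TensorFlow is imported:

```python
import os

# Assumption: devices are hidden through the CUDA_VISIBLE_DEVICES
# environment variable; the question does not say which method was used.
# An empty value exposes no GPUs to CUDA in this process.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# The variable has to be set before TensorFlow initializes CUDA,
# so the import (left commented out here) would come after this point.
# import tensorflow as tf

print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```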

Thank you for your time.

Recommended answer

There are a few options:

1- Reduce the batch size

2- Use memory growth:

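import tensorflow as tf

# TensorFlow 1.x API; in TensorFlow 2.x the same names live under tf.compat.v1
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all at once
session = tf.Session(config=config)  # pass any other Session arguments as needed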

3- Don't allocate all of your GPU memory (only 90%):

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9  # cap this process at 90% of each GPU's memory
session = tf.Session(config=config)  # pass any other Session arguments as needed
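
As a rough illustration of why option 1 helps: the memory for a dense float32 activation tensor grows linearly with the batch size. The sketch below uses a made-up layer shape (224x224x64) that does not come from the question, and the helper name activation_bytes is hypothetical.

```python
from math import prod

BYTES_PER_FLOAT32 = 4

def activation_bytes(batch_size, height, width, channels):
    """Bytes needed to hold one float32 tensor of shape (batch, H, W, C)."""
    return prod((batch_size, height, width, channels)) * BYTES_PER_FLOAT32

# Hypothetical layer output: a 224x224 feature map with 64 channels.
full = activation_bytes(128, 224, 224, 64)
half = activation_bytes(64, 224, 224, 64)

print(f"batch 128: {full / 2**20:.0f} MiB")
print(f"batch  64: {half / 2**20:.0f} MiB")
```

Halving the batch halves this figure for every such tensor in the network, so it is usually the first thing to try.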
