Tensorflow / Windows (native): GPU support [Could not identify NUMA node]


Question

This is about running native TensorFlow (v0.12) on Windows with GPU support.

While some examples work (matmul.py), and I can see a big performance difference between GPU (1.3s) and CPU (4.4s), I do get an issue with one example:

E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0. Your kernel may not have been built with NUMA support.

While others have had a problem with the cuDNN library not being loaded, my library is found and loaded correctly:

I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cudnn64_5.dll locally

Does anybody have the same issue? Was anybody able to solve it? Can I do something to get more logging about what is going wrong?

Recommended answer

Although TensorFlow logs this message at error level, you can probably ignore it, unless you are running a multi-GPU configuration in which different GPUs are attached to different NUMA nodes. As the comment in the code says:

if (numa_node < 0) {
  // For some reason the StreamExecutor couldn't get the NUMA
  // affinity of the GPU.  If this is not a multi-socket mobo with
  // GPUs local to different buses, it doesn't matter.  If it is, we
  // may run into trouble later with data transfer operations.  The
  // trouble may manifest as slower than expected performance, or
  // outright failures.
  LOG(ERROR) << "Could not identify NUMA node of " << name
             << ", defaulting to 0.  Your kernel may not have been built "
                "with NUMA support.";
  numa_node = 0;
}
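
To make the condition in that comment concrete, here is a minimal Python sketch. It is not TensorFlow code: the function name `numa_warning_matters` and the idea of passing in a GPU-to-NUMA-node mapping are assumptions made for illustration, and you would have to obtain the mapping yourself (for example from your hardware documentation).

```python
def numa_warning_matters(gpu_to_numa_node):
    """Return True if the GPUs are spread over more than one NUMA node.

    gpu_to_numa_node is a hypothetical mapping such as
    {"gpu:0": 0, "gpu:1": 1}. It is not something TensorFlow provides.
    Following the comment quoted above, the fallback to node 0 should
    only matter when GPUs are local to different nodes.
    """
    distinct_nodes = set(gpu_to_numa_node.values())
    return len(distinct_nodes) > 1


# A single-GPU desktop: the message can probably be ignored.
print(numa_warning_matters({"gpu:0": 0}))

# Two GPUs on different nodes: the case the comment warns about.
print(numa_warning_matters({"gpu:0": 0, "gpu:1": 1}))
```

For a typical single-GPU Windows workstation the check is false, which matches the advice that the message is harmless there.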

As it turns out, the code that discovers NUMA nodes is only implemented on Linux, because it uses SysFS. If you are running a big-iron Windows server with multiple GPUs and NUMA, please let us know in a GitHub issue, so we can prioritize adding this support.
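
For readers curious what a SysFS-based lookup looks like, the following Python sketch reads the `numa_node` attribute that the Linux kernel exposes for each PCI device under `/sys/bus/pci/devices`. It illustrates the general technique only and is not TensorFlow's implementation, which is C++ inside StreamExecutor. The function names `read_numa_node` and `numa_node_or_default` and the `sysfs_root` parameter are assumptions introduced here. No such path exists on Windows, so there the lookup falls through to the default.

```python
import os

# Standard Linux SysFS location for PCI devices.
SYSFS_PCI_DEVICES = "/sys/bus/pci/devices"


def read_numa_node(pci_bus_id, sysfs_root=SYSFS_PCI_DEVICES):
    """Return the NUMA node of a PCI device, or -1 if it cannot be read.

    pci_bus_id is a string such as "0000:01:00.0". SysFS directory
    names use lowercase hex, so the ID is lowercased first.
    """
    path = os.path.join(sysfs_root, pci_bus_id.lower(), "numa_node")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing (e.g. not Linux) or content is not an integer.
        return -1


def numa_node_or_default(pci_bus_id, sysfs_root=SYSFS_PCI_DEVICES):
    """Mirror the fallback in the quoted snippet: negative means node 0."""
    node = read_numa_node(pci_bus_id, sysfs_root)
    return node if node >= 0 else 0
```

Note that the kernel itself writes `-1` into `numa_node` when it has no NUMA information for a device, so even on Linux the fallback branch can be taken.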
