P2P memory access fails while running multi-GPU CUDA sample (simpleP2P)

Problem description

I'm trying to troubleshoot an error I found while running the simpleP2P sample program, included in the CUDA samples. The error is as follows:

$ ./simpleP2P 
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = "     Tesla K20c" IS  capable of Peer-to-Peer (P2P)
> GPU1 = "     Tesla K20c" IS  capable of Peer-to-Peer (P2P)

Checking GPU(s) for support of peer to peer memory access...
> Peer-to-Peer (P2P) access from Tesla K20c (GPU0) -> Tesla K20c (GPU1) : No
> Peer-to-Peer (P2P) access from Tesla K20c (GPU1) -> Tesla K20c (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P.
Peer to Peer access is not available between GPU0 <-> GPU1, waiving test.

The devices I'm using are the following:

$ lspci | grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)

Additional information concerning connectivity obtained from nvidia-smi:

$ nvidia-smi topo -m
        GPU0    GPU1    CPU Affinity
GPU0     X      SOC     0-5,12-17
GPU1    SOC      X      6-11,18-23

Legend:

  X   = Self
  SOC = Path traverses a socket-level link (e.g. QPI)
  PHB = Path traverses a PCIe host bridge
  PXB = Path traverses multiple PCIe internal switches
  PIX = Path traverses a PCIe internal switch

Finally, more verbose output from the lspci tool:

03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
        Subsystem: NVIDIA Corporation Device 0982
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at f9000000 (32-bit, non-prefetchable)
        Memory at d0000000 (64-bit, prefetchable)
        Memory at ce000000 (64-bit, prefetchable)
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nvidia_346, nouveau, nvidiafb
...
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
        Subsystem: NVIDIA Corporation Device 0982
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at cc000000 (32-bit, non-prefetchable)
        Memory at b0000000 (64-bit, prefetchable)
        Memory at ae000000 (64-bit, prefetchable)
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nvidia_346, nouveau, nvidiafb

Does anyone have information that could help me troubleshoot this, or at least better understand where the problem is? Thanks as usual for reading/helping. -- Omar

Recommended answer

When GPUs are interconnected via a socket-level link (QPI for an Intel-based system):

GPU0     X      SOC     0-5,12-17
GPU1    SOC      X      6-11,18-23
        ^^^

then P2P transactions are not possible between those 2 GPUs.

GPUs participating in P2P must meet a number of requirements. One of them is that they generally must be on the same PCIe root complex. GPUs that are connected via a socket-level link (e.g. QPI) sit on two different "sockets", i.e. two different CPUs, and therefore belong to two different PCIe root complexes.
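As a rough illustration of the rule above, the link labels from the `nvidia-smi topo -m` legend can be mapped to a first-pass hint. This is only a sketch: the enum `P2PHint` and the function `p2p_hint_from_topo` are hypothetical names invented here, and anything other than `SOC` is treated as "not ruled out" rather than as a promise that P2P works.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification, not part of any NVIDIA tool or API.
enum class P2PHint {
    Self,         // "X": the device itself
    NotExpected,  // "SOC": path crosses a socket-level link, so two root complexes
    NotRuledOut,  // "PHB", "PXB", "PIX": topology alone does not exclude P2P
    Unknown       // any label not listed in the legend quoted in the question
};

// Maps one cell of the `nvidia-smi topo -m` matrix to a rough hint.
// The runtime query remains the authority; this only encodes the
// "same PCIe root complex" rule of thumb described in the answer.
P2PHint p2p_hint_from_topo(const std::string& link) {
    assert(!link.empty());
    if (link == "X")   return P2PHint::Self;
    if (link == "SOC") return P2PHint::NotExpected;
    if (link == "PHB" || link == "PXB" || link == "PIX") return P2PHint::NotRuledOut;
    return P2PHint::Unknown;
}
```

For the matrix in the question, the GPU0/GPU1 cell is `SOC`, so this sketch yields `P2PHint::NotExpected`, which matches the "No" printed by simpleP2P.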

Note that in general, P2P support may vary by GPU or GPU family. The ability to run P2P on one GPU type or GPU family does not necessarily indicate it will work on another GPU type or family, even in the same system/setup. The final determinant of GPU P2P support is the result reported by tools that query the runtime via cudaDeviceCanAccessPeer. P2P support can vary by system and other factors as well. No statements made here are a guarantee of P2P support for any particular GPU in any particular setup.
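A minimal sketch of such a query loop follows. To keep it compilable without the CUDA toolkit, the query is passed in as a callable whose shape mirrors `cudaDeviceCanAccessPeer(int* canAccessPeer, int device, int peerDevice)`; the names `PeerQuery` and `peer_access_report` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Same parameter order as cudaDeviceCanAccessPeer; returns 0 on success
// (cudaSuccess is 0 in the CUDA runtime). Hypothetical alias for this sketch.
using PeerQuery = std::function<int(int* can_access, int device, int peer_device)>;

// Queries every ordered pair of distinct devices and returns one line per pair.
// In a real CUDA build, device_count would come from cudaGetDeviceCount and
// the query could be:
//   [](int* c, int d, int p) { return static_cast<int>(cudaDeviceCanAccessPeer(c, d, p)); }
std::vector<std::string> peer_access_report(int device_count, const PeerQuery& query) {
    assert(device_count >= 0);
    std::vector<std::string> lines;
    for (int src = 0; src < device_count; ++src) {
        for (int dst = 0; dst < device_count; ++dst) {
            if (src == dst) continue;  // a device is not its own peer
            int can_access = 0;
            const int status = query(&can_access, src, dst);
            const std::string verdict =
                status != 0 ? "query failed" : (can_access ? "Yes" : "No");
            lines.push_back("GPU" + std::to_string(src) + " -> GPU" +
                            std::to_string(dst) + " : " + verdict);
        }
    }
    return lines;
}
```

Access is checked in both directions because the runtime reports it per ordered pair, just as the simpleP2P output in the question prints one line for GPU0 -> GPU1 and another for GPU1 -> GPU0.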
