GPU card resets after 2 seconds

Problem description

I'm using an NVIDIA GeForce card that gives an error after 2 seconds when I try to run some CUDA programs on it. I read here that you can use the TdrLevel value under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers. However, I don't see any such entry in the registry. Does it need to be added manually? Has anybody else experienced this problem? If so, how did you solve it? Thanks.

Recommended answer

I'm assuming you are using Windows Vista or later.

The article you linked to contains a list of registry entries that control the Microsoft WDDM Timeout Detection and Recovery (TDR) mechanism. As talonmies commented, it is not the card that is giving an error. It is the Windows WDDM TDR mechanism, which detects a long-running kernel and kills it to recover the GPU for display purposes.

While a kernel is running, the GPU is occupied with the compute work and cannot update your display, so a kernel that runs for any noticeable length of time freezes the screen, and most people would consider that bad. Some developers choose to increase the delay so they can develop longer-running kernels, with the understanding that their system may become unresponsive for a few seconds. You may also have to disable TDR if you are using a debugger with a WDDM GPU (NVIDIA Tesla GPUs support TCC, which avoids all the WDDM headaches).

If the entries do not exist, you should create them. I would suggest:


  • TdrLevel 3 (i.e. enabled)
  • TdrDelay 5 (i.e. 5 seconds)
  • TdrLimitTime 10
  • TdrLimitCount 10 (i.e. max 10 timeouts in 10 seconds)
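One way to create these entries without clicking through regedit is to generate a .reg file and import it. The following is only a minimal sketch: the function name `render_tdr_reg` and the constant names are made up for illustration, and it assumes the four entries are stored as REG_DWORD values under the GraphicsDrivers key named in the question.

```python
# Sketch: render a .reg file containing the TDR values suggested above.
# Assumption: all four entries are REG_DWORD values under GraphicsDrivers.
# render_tdr_reg, GRAPHICS_DRIVERS_KEY and TDR_VALUES are illustrative names.

GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers"
)

TDR_VALUES = {
    "TdrLevel": 3,        # TDR enabled
    "TdrDelay": 5,        # seconds before the timeout fires
    "TdrLimitTime": 10,   # window, in seconds, for counting timeouts
    "TdrLimitCount": 10,  # max timeouts allowed within that window
}


def render_tdr_reg(values: dict) -> str:
    """Return the text of a .reg file that sets the given DWORD values."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{GRAPHICS_DRIVERS_KEY}]",
    ]
    for name, value in values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name}={value} does not fit in a DWORD")
        # .reg files write DWORDs as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_tdr_reg(TDR_VALUES))
```

Saving the printed text as, say, tdr.reg and importing it needs administrator rights, and the new values will most likely only take effect after a reboot.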

Alternatives are to use a second GPU for execution, or to adjust your problem size so that each kernel runs for less than 2 seconds. Really big problems should be run on a dedicated GPU. That assumes it's not a bug in your kernel, of course!
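To illustrate the "adjust your problem size" option, the host code can split the work into several shorter launches. This is a rough sketch rather than a recipe: `chunk_ranges` is a hypothetical helper, `items_per_second` is a throughput you would have to measure for your own kernel, and the 50% safety margin is an arbitrary assumption.

```python
# Sketch: split a large workload into ranges small enough that each kernel
# launch should finish before the TDR timeout. All names are illustrative.
from typing import Iterator, Tuple


def chunk_ranges(
    total_items: int,
    items_per_second: float,
    tdr_delay_s: float = 2.0,  # timeout described in the question
    safety: float = 0.5,       # assumed margin: aim for half the timeout
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) ranges whose estimated run time stays under the timeout."""
    if total_items < 0 or items_per_second <= 0 or not 0 < safety <= 1:
        raise ValueError("invalid workload parameters")
    # Largest number of items one launch may process, at least 1.
    max_items = max(1, int(items_per_second * tdr_delay_s * safety))
    for start in range(0, total_items, max_items):
        yield start, min(start + max_items, total_items)


if __name__ == "__main__":
    # Each range would become one kernel launch over items [start, stop).
    for start, stop in chunk_ranges(total_items=10, items_per_second=2.0):
        print(start, stop)
```

Because the display can only be serviced between launches, shorter chunks keep the desktop responsive at the cost of extra launch overhead.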
