Profiling arbitrary CUDA applications

Question

I know that nvvp and nvprof exist, of course, but for various reasons nvprof does not work with my app, which involves lots of shared libraries. nvidia-smi can hook into the driver to find out what is running, but I cannot find a good way to get nvprof to attach to a process that is already running.

There is a flag, --profile-all-processes, which does print the message "NVPROF is profiling process 12345", but nothing further is printed. I am using CUDA 8.

How can I get a detailed performance breakdown of my CUDA kernels in this situation?

Recommended answer

As the comments suggest, you simply have to start the CUDA profiler before the processes you want to profile (the profiler is now Nsight Systems or Nsight Compute, no longer nvprof). You could, for example, configure it to run on system startup.
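One simple way to guarantee that the profiler starts first is to let the profiler launch the application itself. The sketch below only assembles such a command line; it does not run anything. The helper name build_profile_command, the default report name and the ./my_app path are hypothetical and chosen for illustration, and the sketch assumes the usual "-o <output>" option of nsys profile, ncu and nvprof.

```python
import shlex

# Command prefixes for each profiler. The "-o" option names the output file.
# The dictionary and helper below are illustrative, not part of any NVIDIA tool.
PROFILER_PREFIXES = {
    "nsys": ["nsys", "profile", "-o"],  # Nsight Systems: timeline-level view
    "ncu": ["ncu", "-o"],               # Nsight Compute: per-kernel metrics
    "nvprof": ["nvprof", "-o"],         # legacy profiler, e.g. for CUDA 8
}


def build_profile_command(tool, app_argv, report="report"):
    """Return an argv list that launches app_argv under the chosen profiler."""
    if tool not in PROFILER_PREFIXES:
        raise ValueError(f"unknown profiler: {tool!r}")
    if not app_argv:
        raise ValueError("app_argv must contain at least the executable")
    # The profiler comes first, so it is already running when the app starts.
    return PROFILER_PREFIXES[tool] + [report] + list(app_argv)


if __name__ == "__main__":
    # "./my_app" and its arguments are placeholders for the real application.
    cmd = build_profile_command("nsys", ["./my_app", "--size", "1024"])
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run, or the printed line can be pasted into a shell. Which tool applies depends on the installed CUDA toolkit and the GPU generation, so treat the choice of "nsys", "ncu" or "nvprof" as something to check against your own setup.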

Your inability to profile your application has nothing to do with it being an "app that involves lots of shared libraries"; the profiling tools handle such applications just fine.
