Detecting the PTX kernel of a Thrust transform


Problem description

I have the following thrust::transform call:

my_functor *f_1 = new my_functor();
thrust::transform(data.begin(), data.end(), data.begin(),*f_1);

I want to find its corresponding kernel in the PTX file, but many kernels contain my_functor in their mangled names.

For example:

_ZN6thrust6system4cuda6detail6detail23launch_closure_by_valueINS2_17for_each_n_detail18for_each_n_closureINS_12zip_iteratorINS_5tupleINS_6detail15normal_iteratorINS_10device_ptrIiEEEESD_NS_9null_typeESE_SE_SE_SE_SE_SE_SE_EEEEjNS9_30device_unary_transform_functorI10my_functorEENS3_20blocked_thread_arrayEEEEEvT_

_ZN6thrust6system4cuda6detail6detail23launch_closure_by_valueINS2_17for_each_n_detail18for_each_n_closureINS_12zip_iteratorINS_5tupleINS_6detail15normal_iteratorINS_10device_ptrIiEEEESD_NS_9null_typeESE_SE_SE_SE_SE_SE_SE_EEEElNS9_30device_unary_transform_functorI10my_functorEENS3_20blocked_thread_arrayEEEEEvT_

_ZN6thrust6detail15device_functionINS0_30device_unary_transform_functorI10my_functorEEvEC1ERKS4_

Which kernel is launched and what are these other kernels?

Solution

If you are using Visual Studio, use NVIDIA Nsight Visual Studio Edition, which comes with the CUDA Toolkit.

Go to the "Nsight" menu, click on the "Start Performance Analysis..." entry.

  • In "Activity type", select "Profile CUDA Application"
  • In "Experiment settings", tick "Collect Information for CUDA Source View"
  • Choose "All" in the "Experiments to Run" listbox
  • In "Capture Control", tick "Open Report on Stop" and select "CUDA Source View" in the listbox

Then, click on "Launch" and wait for your application to be fully executed. You will see additional output in the console from Nsight.

After the execution, the "CUDA Source View" window will open. Select "Source and PTX" in the "View" listbox, and you will be able to find the correspondence between the source code and the generated PTX. When you click on a line in the source code, one or more lines are highlighted in green in the PTX code.
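
If Nsight is not available, a rough cross-check can be done on the PTX text itself. PTX declares kernels with the `.entry` directive and ordinary device functions with `.func`, so listing only the `.entry` names narrows the candidates, and demangling them makes the template arguments readable. In the Itanium mangling scheme `j` encodes `unsigned int`, `l` encodes `long`, and `C1` marks a constructor, which suggests that the first two names in the question are the same kernel template instantiated for two size types and that the third is a device-side constructor rather than a kernel. The following is a minimal sketch, not part of the original answer: the helper names `demangle`, `list_ptx_kernels`, and `list_ptx_kernels_in_text` are hypothetical, the scan is a simple line-based search rather than a real PTX parser, and `<cxxabi.h>` is assumed to be available (GCC or Clang, not MSVC).

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cxxabi.h>   // assumption: GCC/Clang runtime; not available with MSVC
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Demangles an Itanium-ABI symbol name (the scheme used by the names in the
// question). Returns the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string result(raw);
    std::free(raw);  // the buffer is allocated with malloc by __cxa_demangle
    return result;
}

// Collects the names that follow a `.entry` directive in PTX text.
// `.func` declarations (device functions) are skipped on purpose.
// Hypothetical helper: a line-based scan, so a comment that happens to
// contain ".entry" would be reported as well.
std::vector<std::string> list_ptx_kernels(std::istream& ptx) {
    const std::string directive = ".entry";
    std::vector<std::string> kernels;
    std::string line;
    while (std::getline(ptx, line)) {
        std::size_t pos = line.find(directive);
        if (pos == std::string::npos) {
            continue;
        }
        pos += directive.size();
        // Skip the whitespace between the directive and the kernel name.
        while (pos < line.size() &&
               std::isspace(static_cast<unsigned char>(line[pos]))) {
            ++pos;
        }
        // PTX identifiers consist of letters, digits, '_', '$' and '%'.
        std::size_t end = pos;
        while (end < line.size() &&
               (std::isalnum(static_cast<unsigned char>(line[end])) ||
                line[end] == '_' || line[end] == '$' || line[end] == '%')) {
            ++end;
        }
        if (end > pos) {
            kernels.push_back(line.substr(pos, end - pos));
        }
    }
    return kernels;
}

// Convenience wrapper for PTX that is already held in a string.
std::vector<std::string> list_ptx_kernels_in_text(const std::string& ptx_text) {
    std::istringstream in(ptx_text);
    return list_ptx_kernels(in);
}
```

To use it, open the `.ptx` file produced by `nvcc -ptx` with a `std::ifstream`, pass the stream to `list_ptx_kernels`, and print `demangle(name)` for each result. This only tells you which symbols are kernels; which of them actually runs is still best confirmed with the profiler as described above.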
