nvprof not picking up any API calls or kernels

Question

I'm trying to get some benchmark timings for my CUDA program with nvprof, but unfortunately it doesn't seem to be profiling any API calls or kernels. I looked for a simple beginner's example to make sure I was doing it right and found one on the NVIDIA developer blog:

https://devblogs.nvidia.com/parallelforall/how-optimize-data-transfers-cuda-cc/

Code:

int main()
{
    const unsigned int N = 1048576;
    const unsigned int bytes = N * sizeof(int);
    int *h_a = (int*)malloc(bytes);
    int *d_a;
    cudaMalloc((int**)&d_a, bytes);

    memset(h_a, 0, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost);

    return 0;
}

Command line:

-bash-4.2$ nvcc profile.cu -o profile_test
-bash-4.2$ nvprof ./profile_test

So I replicated it word for word, line by line, and ran it with identical command-line arguments. Unfortunately, the result was the same as with my own program:

-bash-4.2$ nvprof ./profile_test
==85454== NVPROF is profiling process 85454, command: ./profile_test
==85454== Profiling application: ./profile_test
==85454== Profiling result:
No kernels were profiled.

==85454== API calls:
No API activities were profiled. 

I am running the NVIDIA CUDA Toolkit 7.5.

If anyone knows what I'm doing wrong, I'd be grateful to know the answer.

-----EDIT-----

So I modified the code to:

#include<cuda_profiler_api.h>

int main()
{
    cudaProfilerStart();
    const unsigned int N = 1048576;
    const unsigned int bytes = N * sizeof(int);
    int *h_a = (int*)malloc(bytes);
    int *d_a;
    cudaMalloc((int**)&d_a, bytes);

    memset(h_a, 0, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost);

    cudaProfilerStop();
    return 0;
}

Unfortunately, that did not change anything.

Answer

It's a bug in unified memory profiling. Running nvprof with the flag

nvprof --unified-memory-profiling off ./profile_test

resolved all the problems for me.
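
For completeness, here is a minimal sketch of the full invocation wrapped in a small shell function. The function name profile_no_um is my own (hypothetical, not part of the CUDA toolkit); the flag is the one quoted above, and ./profile_test is the binary built in the question. It only adds a check that nvprof is actually on the PATH before running it.

```shell
#!/usr/bin/env bash
# Sketch: profile a CUDA binary with unified memory profiling disabled.
# profile_no_um is a hypothetical helper name, not part of the CUDA toolkit.
profile_no_um() {
    local app="$1"
    # Bail out early if the profiler is not installed or not on the PATH.
    if ! command -v nvprof >/dev/null 2>&1; then
        echo "nvprof not found on PATH" >&2
        return 127
    fi
    # Same flag as in the answer: turn unified memory profiling off.
    nvprof --unified-memory-profiling off "$app"
}

# Usage, with the binary built in the question:
#   nvcc profile.cu -o profile_test
#   profile_no_um ./profile_test
```

If the flag is what was blocking the profiler, the "No kernels were profiled" and "No API activities were profiled" lines should be replaced by the usual timing tables for the cudaMalloc and cudaMemcpy calls.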
