如何将CUDA时钟周期转换为毫秒? [英] How to convert CUDA clock cycles to milliseconds?

查看:151
本文介绍了如何将CUDA时钟周期转换为毫秒?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在内核中的内花费一些时间。我已经关注了这个问题及其注释,以便我的内核看起来像像这样:

I'd like to measure the time a bit of code within my kernel takes. I've followed this question along with its comments so that my kernel looks something like this:

__global__ void kernel(..., long long int *runtime)
{
    long long int start = 0; 
    long long int stop = 0;

    asm volatile("mov.u64 %0, %%clock64;" : "=l"(start));

    /* Some code here */

    asm volatile("mov.u64 %0, %%clock64;" : "=l"(stop));

    runtime[threadIdx.x] = stop - start;
    ...
}

答案回答如下: :


计时器对时钟滴答数进行计数。要获得毫秒数,请将其除以设备上的GHz数再乘以1000。

The timers count the number of clock ticks. To get the number of milliseconds, divide this by the number of GHz on your device and multiply by 1000.

我为此:

for(long i = 0; i < size; i++)
{
  fprintf(stdout, "%d:%ld=%f(ms)\n", i,runtime[i], (runtime[i]/1.62)*1000.0);
}

其中1.62是我设备的GPU最大时钟频率。但是以毫秒为单位的时间看起来并不正确,因为这表明每个线程需要几分钟才能完成。由于执行时间不到挂钟时间的一秒,因此这是不正确的。转换公式不正确还是我在某个地方犯了错误?谢谢。

Where 1.62 is the GPU Max Clock rate of my device. But the time I get in milliseconds does not look right because it suggests that each thread took minutes to complete. This cannot be correct as execution finishes in less than a second of wall-clock time. Is the conversion formula incorrect or am I making a mistake somewhere? Thanks.

推荐答案

在您的情况下,正确的转换不是GHz:

The correct conversion in your case is not GHz:

fprintf(stdout, "%d:%ld=%f(ms)\n", i,runtime[i], (runtime[i]/1.62)*1000.0);
                                                             ^^^^

但赫兹:

fprintf(stdout, "%d:%ld=%f(ms)\n", i,runtime[i], (runtime[i]/1620000000.0f)*1000.0);
                                                             ^^^^^^^^^^^^^

在维度分析中:

                  clock cycles
clock cycles  /  -------------- = seconds
                   second

第一项是时钟周期测量。第二项是GPU的频率(以赫兹为单位,而不是GHz),第三项是所需的测量值(秒)。您可以通过将秒乘以1000来转换为毫秒。

the first term is the clock cycle measurement. The second term is the frequency of the GPU (in hertz, not GHz), the third term is the desired measurement (seconds). You can convert to milliseconds by multiplying seconds by 1000.

这是一个有效的示例,它显示了与设备无关的方式(因此,您不必

Here's a worked example that shows a device-independent way to do it (so you don't have to hard-code the clock frequency):

$ cat t1306.cu
#include <stdio.h>

const long long delay_time = 1000000000;
const int nthr = 1;
const int nTPB = 256;

__global__ void kernel(long long *clocks){

  int idx=threadIdx.x+blockDim.x*blockIdx.x;
  long long start=clock64();
  while (clock64() < start+delay_time);
  if (idx < nthr) clocks[idx] = clock64()-start;
}

int main(){

  int peak_clk = 1;
  int device = 0;
  long long *clock_data;
  long long *host_data;
  host_data = (long long *)malloc(nthr*sizeof(long long));
  cudaError_t err = cudaDeviceGetAttribute(&peak_clk, cudaDevAttrClockRate, device);
  if (err != cudaSuccess) {printf("cuda err: %d at line %d\n", (int)err, __LINE__); return 1;}
  err = cudaMalloc(&clock_data, nthr*sizeof(long long));
  if (err != cudaSuccess) {printf("cuda err: %d at line %d\n", (int)err, __LINE__); return 1;}
  kernel<<<(nthr+nTPB-1)/nTPB, nTPB>>>(clock_data);
  err = cudaMemcpy(host_data, clock_data, nthr*sizeof(long long), cudaMemcpyDeviceToHost);
  if (err != cudaSuccess) {printf("cuda err: %d at line %d\n", (int)err, __LINE__); return 1;}
  printf("delay clock cycles: %ld, measured clock cycles: %ld, peak clock rate: %dkHz, elapsed time: %fms\n", delay_time, host_data[0], peak_clk, host_data[0]/(float)peak_clk);
  return 0;
}
$ nvcc -arch=sm_35 -o t1306 t1306.cu
$ ./t1306
delay clock cycles: 1000000000, measured clock cycles: 1000000210, peak clock rate: 732000kHz, elapsed time: 1366.120483ms
$

此命令使用 cudaDeviceGetAttribute 即可获取时钟速率,返回的结果以kHz为单位,在这种情况下,我们可以轻松计算毫秒。

This uses cudaDeviceGetAttribute to get the clock rate, which returns a result in kHz, which allows us to easily compute milliseconds in this case.

这篇关于如何将CUDA时钟周期转换为毫秒?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆