cuda - Zero-copy memory, memory-mapped file


Problem description

I am trying to memory-map a file containing uint32_t values and then use that mapping as zero-copy pinned memory for CUDA, as shown below. After allocating space and mapping the file, I get cudaErrorInvalidValue when requesting the device pointer. According to the API documentation, this error means:

This indicates that one or more of the parameters passed to the API call is not within an acceptable range of values.

But I'm struggling to figure out why I'm having this problem. Any ideas? Thanks in advance.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

    …

int main(void) 
{
  struct stat buf;

    …

  uint32_t *data, *dev_data;

  cudaDeviceProp cuda_prop;
  cudaGetDeviceProperties(&cuda_prop, 0);
  if (!cuda_prop.canMapHostMemory) 
    exit(EXIT_FAILURE);

  cudaSetDeviceFlags(cudaDeviceMapHost);


  int data_file = open(data_file_name, O_RDONLY);
  int stat = fstat(sa_file, &buf);
  int data_file_size = buf.st_size;

  err = cudaHostAlloc((void**)&data, data_file_size, cudaHostAllocMapped);
  if (err == cudaErrorMemoryAllocation) exit(EXIT_FAILURE);

  data = (uint32_t*) mmap(0, data_file_size, PROT_READ, MAP_PRIVATE, data_file, 0);

  err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);
  if (err == cudaErrorMemoryAllocation)
  {
    printf("cudaHostGetDevicePointer - Mem Alloc Err\n");
    exit(EXIT_FAILURE);
  }
  else if (err == cudaErrorInvalidValue) //ERROR HERE.
  {
    printf("cudaHostGetDevicePointer - Invalid Val Err\n");
    exit(EXIT_FAILURE);
  }

    …

}

Recommended answer

One problem is that the logical sequence of your program is incorrect. This line assigns data a value provided by the CUDA API:

err = cudaHostAlloc((void**)&data, data_file_size, cudaHostAllocMapped);

This line then overwrites that value with a new one:

data = (uint32_t*) mmap(0, data_file_size, PROT_READ, MAP_PRIVATE, data_file, 0);

At that point, data no longer holds the pointer that the CUDA API recognizes as pinned memory, so when you call this:

err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);

you get an error, because the value contained in data is not recognized.

(Based on this question.) Apart from that issue, it seems that if you change the file handling from read-only to read-write, this process can be made to work (it throws no runtime errors). Here is a complete program that demonstrates this. It does not contain the logical flaw above: instead of allocating separate memory with cudaHostAlloc, it pins the mapped region itself with cudaHostRegister. (I previously created a test.dat file of size 566316 bytes.)

$ cat t706.cu
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

int main(void)
{
  struct stat buf;

  char *dev_data;

  cudaDeviceProp cuda_prop;
  cudaGetDeviceProperties(&cuda_prop, 0);
  if (!cuda_prop.canMapHostMemory)
    exit(EXIT_FAILURE);

  cudaSetDeviceFlags(cudaDeviceMapHost);


  int data_file = open("test.dat", O_RDWR);
  int stat = fstat(data_file, &buf);
  int data_file_size = buf.st_size;
  printf("data_file_size = %d\n", data_file_size);
  char *data = (char *) mmap(0, data_file_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, data_file, 0);
  if (data == MAP_FAILED) {
    printf("mmap failure\n");
    exit(EXIT_FAILURE);}
  cudaError_t err = cudaHostRegister(data, data_file_size, cudaHostRegisterDefault);
  if (err != cudaSuccess) { //ERROR HERE.
    printf("cudaHostRegister fail\n");
    exit(EXIT_FAILURE);}

  err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);
  if (err == cudaErrorMemoryAllocation)
  {
    printf("cudaHostGetDevicePointer - Mem Alloc Err\n");
    exit(EXIT_FAILURE);
  }
  else if (err == cudaErrorInvalidValue)
  {
    printf("cudaHostGetDevicePointer - Invalid Val Err\n");
    exit(EXIT_FAILURE);
  }

}
$ nvcc -arch=sm_30 -o t706 t706.cu
$ ./t706
data_file_size = 566316
$
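
Because the working version opens the file read-write and maps it with PROT_READ|PROT_WRITE, a reader might worry that writes through the mapping could alter test.dat. With MAP_PRIVATE they should not: POSIX specifies that modifications to a private mapping are visible only to the calling process and do not change the underlying file. The following host-only sketch illustrates this without any CUDA calls. The helper name private_write_leaves_file_intact is hypothetical and not part of the original answer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: needed for pread() under strict -std modes */
#endif
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer).
   Maps `path` read-write with MAP_PRIVATE, changes the first byte in memory,
   then re-reads that byte from the file descriptor.
   Returns 1 if the file on disk is unchanged, 0 if it changed, -1 on error. */
int private_write_leaves_file_intact(const char *path)
{
  int fd = open(path, O_RDWR);
  if (fd < 0) return -1;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return -1; }

  char *map = (char *) mmap(NULL, (size_t) st.st_size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
  if (map == MAP_FAILED) { close(fd); return -1; }

  char before = map[0];
  map[0] = (char) (before + 1);  /* touches only this process's private copy */

  char on_disk;
  ssize_t n = pread(fd, &on_disk, 1, 0);  /* read the byte straight from the file */

  munmap(map, (size_t) st.st_size);
  close(fd);

  if (n != 1) return -1;
  return on_disk == before;
}
```

Calling private_write_leaves_file_intact("test.dat") on the file from the example above would be expected to return 1. If the data written through the mapping (by the host or, presumably, by a kernel through the device pointer) needs to reach the file, MAP_SHARED would be the flag to investigate instead, which the original answer does not cover.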
