cuda - Zero-copy memory, memory-mapped file

Question

I am trying to create a memory-mapped file containing uint32_t values and then use it as zero-copy pinned memory in CUDA, as shown below. After allocating space and mapping the memory from the file, I get cudaErrorInvalidValue when requesting the device pointer. According to the API documentation, this error means:

This indicates that one or more of the parameters passed to the API call is not within an acceptable range of values.

But I am struggling to figure out why I am running into this. Any ideas? Thanks in advance.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

    …

int main(void) 
{
  struct stat buf;

    …

  uint32_t *data, *dev_data;

  cudaDeviceProp cuda_prop;
  cudaGetDeviceProperties(&cuda_prop, 0);
  if (!cuda_prop.canMapHostMemory) 
    exit(EXIT_FAILURE);

  cudaSetDeviceFlags(cudaDeviceMapHost);


  int data_file = open(data_file_name, O_RDONLY);
  int stat = fstat(data_file, &buf);
  int data_file_size = buf.st_size;

  err = cudaHostAlloc((void**)&data, data_file_size, cudaHostAllocMapped);
  if (err == cudaErrorMemoryAllocation) exit(EXIT_FAILURE);

  data = (uint32_t*) mmap(0, data_file_size, PROT_READ, MAP_PRIVATE, data_file, 0);

  err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);
  if (err == cudaErrorMemoryAllocation)
  {
    printf("cudaHostGetDevicePointer - Mem Alloc Err\n"); 
    exit(EXIT_FAILURE);
  }
  else if (err == cudaErrorInvalidValue) //ERROR HERE.
  {
    printf("cudaHostGetDevicePointer - Invalid Val Err\n"); 
    exit(EXIT_FAILURE);
  }

    …

}

Recommended answer

One problem is that the logical sequence of your program is incorrect. This line assigns data a value provided by the CUDA API:

err = cudaHostAlloc((void**)&data, data_file_size, cudaHostAllocMapped);

This line then overwrites that value with a new one:

data = (uint32_t*) mmap(0, data_file_size, PROT_READ, MAP_PRIVATE, data_file, 0);

At that point, data no longer holds the address of the pinned allocation, and the CUDA API does not recognize the mmap address as pinned memory. So when you call this:

err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);

you get an error, because the value contained in data is not recognized.
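One way to avoid the overwrite is to keep the pointer returned by cudaHostAlloc and copy the file contents into that buffer with read(), instead of reassigning data from mmap. The following is only a minimal sketch of that ordering, not the approach the original poster asked about. The file name test.dat and the variable names are placeholders, and the program has not been run against the poster's data.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstdint>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cuda_runtime.h>

int main(void)
{
  // "test.dat" is a placeholder file name chosen for this sketch.
  int fd = open("test.dat", O_RDONLY);
  if (fd < 0) { perror("open"); return EXIT_FAILURE; }

  struct stat st;
  if (fstat(fd, &st) != 0) { perror("fstat"); return EXIT_FAILURE; }
  size_t size = (size_t)st.st_size;
  if (size == 0) { fprintf(stderr, "empty file\n"); return EXIT_FAILURE; }

  // Request mapped pinned allocations before any allocation is made.
  cudaSetDeviceFlags(cudaDeviceMapHost);

  uint32_t *data = NULL, *dev_data = NULL;
  cudaError_t err = cudaHostAlloc((void**)&data, size, cudaHostAllocMapped);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaHostAlloc: %s\n", cudaGetErrorString(err));
    return EXIT_FAILURE;
  }

  // Fill the pinned buffer from the file; `data` itself is never reassigned.
  size_t done = 0;
  while (done < size) {
    ssize_t n = read(fd, (char*)data + done, size - done);
    if (n <= 0) { perror("read"); return EXIT_FAILURE; }
    done += (size_t)n;
  }
  close(fd);

  // `data` still refers to the allocation the CUDA API knows about.
  err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaHostGetDevicePointer: %s\n", cudaGetErrorString(err));
    return EXIT_FAILURE;
  }
  printf("device pointer obtained for %zu bytes\n", size);

  cudaFreeHost(data);
  return EXIT_SUCCESS;
}
```

This trades the memory mapping for an explicit copy, so it only suits files that fit comfortably in pinned host memory.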

EDIT (based on this question): Apart from that issue, it seems that if you change the file handling from read-only to read-write, this process can be made to work (it throws no runtime errors). Here is a complete program, without the logical flaw described above, that demonstrates this (I had previously created a test.dat file of size 566316 bytes):

$ cat t706.cu
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

int main(void)
{
  struct stat buf;

  char *dev_data;

  cudaDeviceProp cuda_prop;
  cudaGetDeviceProperties(&cuda_prop, 0);
  if (!cuda_prop.canMapHostMemory)
    exit(EXIT_FAILURE);

  cudaSetDeviceFlags(cudaDeviceMapHost);


  int data_file = open("test.dat", O_RDWR);
  int stat = fstat(data_file, &buf);
  int data_file_size = buf.st_size;
  printf("data_file_size = %d\n", data_file_size);
  char *data = (char *) mmap(0, data_file_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, data_file, 0);
  if (data == MAP_FAILED) {  // mmap signals failure with MAP_FAILED, not NULL
    printf("mmap failure\n");
    exit(EXIT_FAILURE);}
  cudaError_t err = cudaHostRegister(data, data_file_size, cudaHostRegisterDefault);
  if (err != cudaSuccess) { //ERROR HERE.
    printf("cudaHostRegister fail\n");
    exit(EXIT_FAILURE);}

  err = cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0);
  if (err == cudaErrorMemoryAllocation)
  {
    printf("cudaHostGetDevicePointer - Mem Alloc Err\n");
    exit(EXIT_FAILURE);
  }
  else if (err == cudaErrorInvalidValue)
  {
    printf("cudaHostGetDevicePointer - Invalid Val Err\n");
    exit(EXIT_FAILURE);
  }

}
$ nvcc -arch=sm_30 -o t706 t706.cu
$ ./t706
data_file_size = 566316
$
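
The program above obtains a device pointer but never uses it. As a rough sketch of the next step, the code below launches a kernel that reads the mapped file through that pointer and compares the result with a host-side sum. Several things here are assumptions rather than part of the original answer: the kernel byte_sum, the CHECK macro, the launch configuration, and the use of cudaHostRegisterMapped instead of cudaHostRegisterDefault. Whether a kernel can read registered mmap memory on a particular system has not been verified here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: abort with a message on any CUDA error.
#define CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
  fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e_)); \
  exit(EXIT_FAILURE); } } while (0)

// Illustrative kernel: grid-stride sum of all bytes, combined with one atomic per thread.
__global__ void byte_sum(const unsigned char *p, size_t n, unsigned long long *out)
{
  unsigned long long local = 0;
  size_t stride = (size_t)gridDim.x * blockDim.x;
  for (size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x; i < n; i += stride)
    local += p[i];
  atomicAdd(out, local);
}

int main(void)
{
  CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));

  int fd = open("test.dat", O_RDWR);  // read-write, as in the answer
  if (fd < 0) { perror("open"); return EXIT_FAILURE; }
  struct stat st;
  if (fstat(fd, &st) != 0) { perror("fstat"); return EXIT_FAILURE; }
  size_t size = (size_t)st.st_size;

  unsigned char *data = (unsigned char*)mmap(0, size, PROT_READ | PROT_WRITE,
                                             MAP_PRIVATE, fd, 0);
  if (data == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }

  // Pin the mapping and ask for it to be mapped into the device address space.
  CHECK(cudaHostRegister(data, size, cudaHostRegisterMapped));
  unsigned char *dev_data = NULL;
  CHECK(cudaHostGetDevicePointer((void**)&dev_data, (void*)data, 0));

  unsigned long long *dev_sum = NULL, gpu_sum = 0, cpu_sum = 0;
  CHECK(cudaMalloc((void**)&dev_sum, sizeof(*dev_sum)));
  CHECK(cudaMemset(dev_sum, 0, sizeof(*dev_sum)));

  byte_sum<<<64, 256>>>(dev_data, size, dev_sum);
  CHECK(cudaGetLastError());
  // This blocking copy also waits for the kernel to finish.
  CHECK(cudaMemcpy(&gpu_sum, dev_sum, sizeof(gpu_sum), cudaMemcpyDeviceToHost));

  for (size_t i = 0; i < size; ++i) cpu_sum += data[i];
  printf("gpu sum = %llu, cpu sum = %llu\n", gpu_sum, cpu_sum);

  // Release resources in reverse order of acquisition.
  CHECK(cudaFree(dev_sum));
  CHECK(cudaHostUnregister(data));
  munmap(data, size);
  close(fd);
  return gpu_sum == cpu_sum ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Unregistering before munmap matters: the pages should stay mapped for as long as the CUDA runtime treats them as pinned.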
