pyCuda, issues sending multiple single variable arguments

Question

I have a PyCUDA program that reads an image whose path is given on the command line and saves a copy with the colors inverted:

import pycuda.autoinit
import pycuda.driver as device
from pycuda.compiler import SourceModule as cpp

import numpy as np
import sys
import cv2

modify_image = cpp("""
__global__ void modify_image(int pixelcount, unsigned char* inputimage, unsigned char* outputimage)
{
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  if (id >= pixelcount)
    return;

  outputimage[id] = 255 - inputimage[id];
}
""").get_function("modify_image")

print("Loading image")

image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED).astype(np.uint8)

print("Processing image")

pixels = image.shape[0] * image.shape[1]
newchannels = []
for channel in cv2.split(image):
  output = np.zeros_like(channel)
  modify_image(
    device.In(np.int32(pixels)),
    device.In(channel),
    device.Out(output),
    block=(1024,1,1), grid=(pixels // 1024 + 1, 1))
  newchannels.append(output)
finalimage = cv2.merge(newchannels)

print("Saving image")

cv2.imwrite("processed.png", finalimage)

print("Done")

It works perfectly fine, even on larger images. However, while trying to extend the program, I ran into a really strange issue: adding a second scalar argument to the kernel makes the program fail silently and save a completely black image. The following code does not work:

import pycuda.autoinit
import pycuda.driver as device
from pycuda.compiler import SourceModule as cpp

import numpy as np
import sys
import cv2

modify_image = cpp("""
__global__ void modify_image(int pixelcount, int width, unsigned char* inputimage, unsigned char* outputimage)
{
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  if (id >= pixelcount)
    return;

  outputimage[id] = 255 - inputimage[id];
}
""").get_function("modify_image")

print("Loading image")

image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED).astype(np.uint8)

print("Processing image")

pixels = image.shape[0] * image.shape[1]
newchannels = []
for channel in cv2.split(image):
  output = np.zeros_like(channel)
  modify_image(
    device.In(np.int32(pixels)),
    device.In(np.int32(image.shape[0])),
    device.In(channel),
    device.Out(output),
    block=(1024,1,1), grid=(pixels // 1024 + 1, 1))
  newchannels.append(output)
finalimage = cv2.merge(newchannels)

print("Saving image")

cv2.imwrite("processed.png", finalimage)

print("Done")

The only difference is in two lines: the kernel signature and its call. The body of the kernel is unchanged, and yet this small addition completely breaks the program. Neither the compiler nor the interpreter throws any errors. I have no idea how to begin debugging it, and am thoroughly confused.

Recommended answer

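device.In and its relatives (device.Out, device.InOut) are designed for use with objects that support the Python buffer protocol, such as numpy arrays. They copy the buffer to device memory and hand the kernel a pointer to it. The source of your problem is using them to transfer scalars that the kernel expects to receive by value.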
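One way to see what probably goes wrong is to model the argument buffer with Python's struct module. The sketch below uses no GPU. It assumes a 64-bit platform, assumes that every device.In/device.Out argument reaches the kernel as an 8-byte device pointer, and uses made-up addresses. Under those assumptions, the one-scalar call happens to line the two image pointers up with the kernel's signature, which would explain why it appeared to work, while the two-scalar call shifts them.

```python
import struct

# Hypothetical 64-bit device addresses, for illustration only.
ptr_pixels = 0x7F0000001000  # buffer that device.In creates for np.int32(pixels)
ptr_width = 0x7F0000002000   # buffer that device.In creates for the second scalar
ptr_in = 0x7F0000003000      # input channel
ptr_out = 0x7F0000004000     # output channel

# One scalar: three pointers are sent, the kernel reads (int, ptr, ptr).
# The int is padded to 8 bytes, so both image pointers still line up.
sent1 = struct.pack("@PPP", ptr_pixels, ptr_in, ptr_out)
pixelcount1, in1, out1 = struct.unpack_from("@iPP", sent1)

# Two scalars: four pointers are sent, the kernel reads (int, int, ptr, ptr).
# Both ints fit in the first 8 bytes, so every later argument shifts by one slot.
sent2 = struct.pack("@PPPP", ptr_pixels, ptr_width, ptr_in, ptr_out)
pixelcount2, width2, in2, out2 = struct.unpack_from("@iiPP", sent2)

print("one scalar, image pointers line up:", in1 == ptr_in and out1 == ptr_out)
print("two scalars, kernel input is the width buffer:", in2 == ptr_width)
print("two scalars, kernel output is the input image:", out2 == ptr_in)
```

In both cases pixelcount is read from the bytes of a pointer rather than from the intended value, so even the one-scalar version relies on the bounds check receiving a usable number by chance. In the two-scalar case the real output buffer is never written, which is consistent with the black image.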

Just pass your scalars with the correct numpy dtype directly to your kernel call. Don't use device.In for them. The fact that this worked in the original case was a complete accident.
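A minimal sketch of the corrected call, assuming the same kernel as in the question: the two scalars are passed as np.int32 values, and only the arrays go through device.In and device.Out. The helper names grid_size and invert_channel are illustrative and not part of PyCUDA. The sketch also passes channel.shape[1] as the width, whereas the question passed image.shape[0], which is the height; the kernel does not use the value yet, so this does not change the result.

```python
import numpy as np

KERNEL_SOURCE = """
__global__ void modify_image(int pixelcount, int width,
                             unsigned char* inputimage,
                             unsigned char* outputimage)
{
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  if (id >= pixelcount)
    return;

  outputimage[id] = 255 - inputimage[id];
}
"""

BLOCK_SIZE = 1024


def grid_size(pixels, block_size=BLOCK_SIZE):
    """Number of blocks needed to cover `pixels` threads (ceiling division)."""
    return (pixels + block_size - 1) // block_size


def invert_channel(channel):
    """Invert one 2-D uint8 channel on the GPU (needs PyCUDA and a CUDA device)."""
    # Imported lazily so this module can be loaded on a machine without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as device
    from pycuda.compiler import SourceModule

    # Compiled on every call for brevity; compile once if you call this in a loop.
    kernel = SourceModule(KERNEL_SOURCE).get_function("modify_image")

    channel = np.ascontiguousarray(channel, dtype=np.uint8)
    output = np.zeros_like(channel)
    height, width = channel.shape
    pixels = height * width

    kernel(
        np.int32(pixels),    # scalar: passed by value, no device.In
        np.int32(width),     # scalar: passed by value, no device.In
        device.In(channel),  # array: copied host -> device before the launch
        device.Out(output),  # array: copied device -> host after the launch
        block=(BLOCK_SIZE, 1, 1),
        grid=(grid_size(pixels), 1),
    )
    return output
```

In the question's loop this would be used as newchannels.append(invert_channel(channel)) for each channel returned by cv2.split(image).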
