Get results of GPU calculations back to the CPU program in OpenGL

Question

Is there a way to get results from a shader running on a GPU back to the program running on the CPU?

I want to generate a polygon mesh from simple voxel data using a computationally costly algorithm on the GPU, but I need the result on the CPU for physics calculations.

Recommended answer

Define "results"?

In general, if you're doing GPGPU-style computations with OpenGL, you are going to need to structure your shaders around the needs of a rendering system. Rendering systems are designed to be one-way: data goes into them and an image is produced. Going backwards, having the rendering system produce data, is not generally how rendering systems are structured.

That doesn't mean you can't do it, of course. But you need to architect everything around the limitations of OpenGL.

OpenGL offers a number of hooks where you can write data from certain shader stages. Most of these require specialized hardware.

Any hardware capable of fragment shaders will obviously allow you to write to the current framebuffer you're rendering to. Through the use of framebuffer objects (FBOs) and textures with floating-point or integer image formats, you can write pretty much any data you want to a variety of images. Once the data is in a texture, you can simply call glGetTexImage to get the rendered pixel data. Or you can just do glReadPixels to get it if the FBO is still bound. Either way works.
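As a rough illustration of the CPU side of such a readback, the sketch below indexes into a tightly packed RGBA float buffer of the kind glReadPixels would fill when called with GL_RGBA and GL_FLOAT. The names `Texel`, `readbackFloatCount`, and `texelAt` are hypothetical helpers, not part of OpenGL, and the four-floats-per-texel layout is an assumption about how the render target was set up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One RGBA texel as read back with format GL_RGBA and type GL_FLOAT.
struct Texel {
    float r, g, b, a;
};

// Number of floats the CPU-side buffer must hold for a width x height readback.
std::size_t readbackFloatCount(std::size_t width, std::size_t height) {
    return width * height * 4;
}

// Fetch texel (x, y) from a tightly packed RGBA float readback.
// Row y = 0 is the bottom row, which is the order glReadPixels returns.
Texel texelAt(const std::vector<float>& pixels, std::size_t width,
              std::size_t x, std::size_t y) {
    assert(x < width);
    const std::size_t base = (y * width + x) * 4;
    assert(base + 3 < pixels.size());
    return Texel{pixels[base], pixels[base + 1], pixels[base + 2], pixels[base + 3]};
}

// In a real program the buffer would be filled while the FBO is bound, e.g.:
//   std::vector<float> pixels(readbackFloatCount(width, height));
//   glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, pixels.data());
```

Each texel then carries up to four floats of result data, so how the shader packs its results into channels decides how the CPU code has to unpack them.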

The primary limitations of this method are:

  • The number of images you can attach to the framebuffer; this limits the amount of data you can write. On pre-GL 3.x hardware, FBOs were typically limited to only 4 images plus a depth/stencil buffer. In 3.x and better hardware, you can expect a minimum of 8 images.

  • The fact that you're rendering. This means that you need to set up your vertex data to position a triangle exactly where you want it to modify data. This is not a trivial undertaking. It's also difficult to get useful input data, since you typically want each texel to be fairly independent of the others. Structuring your fragment shader around these limitations is difficult. Not impossible, but non-trivial in many cases.
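To make the "position a triangle exactly where you want it" point concrete, here is a minimal sketch of the coordinate arithmetic, assuming the viewport covers the whole render target. `NdcRect` and `texelBlockToNdc` are hypothetical names; the function only computes the normalized device coordinates of a rectangle that covers a given block of texels, which would then be drawn as two triangles.

```cpp
#include <cassert>

// Corners of an axis-aligned rectangle in normalized device coordinates.
struct NdcRect {
    float x0, y0, x1, y1;
};

// Map a block of texels (origin x, y; size w, h) in a targetW x targetH
// render target to the NDC rectangle that covers exactly those texels.
// Assumes the viewport is the full target: glViewport(0, 0, targetW, targetH).
NdcRect texelBlockToNdc(int x, int y, int w, int h, int targetW, int targetH) {
    assert(targetW > 0 && targetH > 0);
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= targetW && y + h <= targetH);
    NdcRect r;
    r.x0 = 2.0f * x / targetW - 1.0f;
    r.y0 = 2.0f * y / targetH - 1.0f;
    r.x1 = 2.0f * (x + w) / targetW - 1.0f;
    r.y1 = 2.0f * (y + h) / targetH - 1.0f;
    return r;
}
```

Covering the whole target maps to the rectangle from (-1, -1) to (1, 1), so one fragment shader invocation runs per texel of the output image.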

Transform feedback, an OpenGL 3.0 feature, allows the output from the Vertex Processing stage of OpenGL (vertex shader and optional geometry shader) to be captured in one or more buffer objects.

This is much more natural for capturing vertex data that you want to play with or render again. In your case, you'll need to read it back after rendering it, perhaps with a glGetBufferSubData call, or by using glMapBufferRange for reading.

The limitations here are that you can generally capture only 4 output values, where each value is a vec4. There are also some strict layout restrictions. Some OpenGL 3.x and 4.x hardware offers the ability to write data to multiple feedback streams, which can all be written into different buffers.
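Once the captured buffer has been read back, the CPU still has to interpret it. The sketch below assumes a hypothetical layout of two interleaved vec4 outputs per vertex (a position and a normal) and triangle output, so every three vertices form one triangle for the physics code. `CapturedVertex`, `decodeFeedback`, and `triangleCount` are illustrative names only; the real layout depends on which outputs were registered for capture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-vertex layout: two captured vec4 outputs, interleaved.
struct CapturedVertex {
    float position[4];
    float normal[4];
};

constexpr std::size_t kFloatsPerVertex = 8;

// Split the raw float data read back from the feedback buffer into vertices.
std::vector<CapturedVertex> decodeFeedback(const std::vector<float>& raw) {
    assert(raw.size() % kFloatsPerVertex == 0);
    std::vector<CapturedVertex> vertices(raw.size() / kFloatsPerVertex);
    for (std::size_t v = 0; v < vertices.size(); ++v) {
        const float* src = raw.data() + v * kFloatsPerVertex;
        for (int i = 0; i < 4; ++i) {
            vertices[v].position[i] = src[i];
            vertices[v].normal[i] = src[4 + i];
        }
    }
    return vertices;
}

// With triangle output, every three captured vertices form one triangle.
std::size_t triangleCount(const std::vector<CapturedVertex>& vertices) {
    return vertices.size() / 3;
}

// In a real program the raw data would come from the feedback buffer, e.g.:
//   glGetBufferSubData(GL_TRANSFORM_FEEDBACK_BUFFER, 0,
//                      raw.size() * sizeof(float), raw.data());
```

How many primitives were actually written can be obtained with a GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query, which tells you how much of the buffer holds valid data.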

Image load/store, a GL 4.2 feature, represents the pinnacle of writing: you can bind an image (a buffer texture, if you want to write to a buffer), and just write to it. There are memory ordering constraints that you need to work within.

It's very flexible, but very complex. Besides the difficulty in using it properly, there are a number of limitations. The number of images you can write to will be fairly limited, perhaps 8 or so. And implementations may have total write limits, so those 8 images may have to be shared with the fragment shader's outputs.

What's more, image outputs are only guaranteed for the fragment shader (and 4.3's compute shaders). That is, hardware is allowed to forbid you from using image load/store on non-FS/CS shader stages.
