Get results of GPU calculations back to the CPU program in OpenGL

Question

Is there a way to get results from a shader running on the GPU back to the program running on the CPU?

I want to generate a polygon mesh from simple voxel data using a computationally costly algorithm on the GPU, but I need the result on the CPU for physics calculations.

Recommended answer

Define "results"?

In general, if you're doing GPGPU-style computations with OpenGL, you are going to need to structure your shaders around the needs of a rendering system. Rendering systems are designed to be one-way: data goes into them and an image is produced. Going backwards, having the rendering system produce data, is not generally how rendering systems are structured.

That doesn't mean you can't do it, of course. But you need to architect everything around the limitations of OpenGL.

OpenGL offers a number of hooks where you can write data from certain shader stages. Most of these require specialized hardware.

Any hardware capable of fragment shaders will obviously allow you to write to the framebuffer you're currently rendering to. Through the use of framebuffer objects and textures with floating-point or integer image formats, you can write pretty much any data you want to a variety of images. Once the data is in a texture, you can simply call glGetTexImage to get the rendered pixel data. Or you can call glReadPixels to get it if the FBO is still bound. Either way works.
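As a rough illustration of the CPU side of this readback, the sketch below holds the flat float array that glReadPixels or glGetTexImage would fill for an RGBA32F attachment and indexes it by texel. The names `Readback`, `Texel`, `fbo` and `tex` are hypothetical, and the GL calls appear only in comments because they need a live context; with default pixel-store state an RGBA float row is tightly packed, which is what the indexing assumes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One RGBA32F texel as delivered by a GL_RGBA / GL_FLOAT readback.
struct Texel { float r, g, b, a; };

// CPU-side view of a readback buffer (hypothetical helper, not a GL type).
// Row 0 is the bottom row, following OpenGL's lower-left origin.
struct Readback {
    int width = 0, height = 0;
    std::vector<float> data;  // width * height * 4 floats

    Readback(int w, int h)
        : width(w), height(h), data(std::size_t(w) * h * 4) {}

    // Destination pointer to hand to glReadPixels / glGetTexImage.
    float* storage() { return data.data(); }

    // Texel at GL coordinates (x, y).
    Texel at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        const float* p = &data[(std::size_t(y) * width + x) * 4];
        return {p[0], p[1], p[2], p[3]};
    }
};

// With a live context, filling the buffer would look roughly like this
// (fbo and tex are assumed handles created elsewhere):
//   glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
//   glReadBuffer(GL_COLOR_ATTACHMENT0);
//   glReadPixels(0, 0, rb.width, rb.height, GL_RGBA, GL_FLOAT, rb.storage());
// or, reading from the texture instead of the framebuffer:
//   glBindTexture(GL_TEXTURE_2D, tex);
//   glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, rb.storage());
```

Either call blocks until the GPU has finished rendering into the image, so for large or frequent readbacks it may be worth reading into a pixel buffer object and mapping it later.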

The primary limitations of this method are:

  • The number of images you can attach to the framebuffer; this limits the amount of data you can write. On pre-GL 3.x hardware, FBOs were typically limited to only 4 images plus a depth/stencil buffer. In 3.x and better hardware, you can expect a minimum of 8 images.

  • The fact that you're rendering. This means that you need to set up your vertex data to position a triangle exactly where you want it to modify data. This is not a trivial undertaking. It's also difficult to get useful input data, since you typically want each texel to be fairly independent of the others. Structuring your fragment shader around these limitations is difficult. Not impossible, but non-trivial in many cases.
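To make the "position a triangle exactly where you want it" point concrete, here is a minimal sketch of the bookkeeping involved. It assumes the data is a linear array stored row by row in a render target of size width x height, with the viewport covering the whole target; the helper names are made up for illustration.

```cpp
#include <cassert>

struct TexelCoord { int x, y; };
struct Ndc { float x, y; };

// Texel that holds element `index` of a linear array laid out row by row
// in a texture that is `width` texels wide.
inline TexelCoord texel_of(int index, int width) {
    assert(index >= 0 && width > 0);
    return {index % width, index / width};
}

// Normalized device coordinates of a texel's center, assuming the viewport
// is (0, 0, width, height). The center sits at window coordinate
// (x + 0.5, y + 0.5), and NDC spans [-1, 1] across the viewport.
inline Ndc ndc_center(TexelCoord t, int width, int height) {
    assert(width > 0 && height > 0);
    return {2.0f * (t.x + 0.5f) / width - 1.0f,
            2.0f * (t.y + 0.5f) / height - 1.0f};
}
```

Geometry spanning NDC -1 to 1 on both axes touches every texel once, and inside the fragment shader `ivec2(gl_FragCoord.xy)` recovers the same (x, y) pair, so each invocation can work out which element it is responsible for.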

Transform feedback, an OpenGL 3.0 feature, allows the output from the Vertex Processing stage of OpenGL (vertex shader and optional geometry shader) to be captured in one or more buffer objects.

This is much more natural for capturing vertex data that you want to play with or render again. In your case, you'll need to read it back after rendering, perhaps with a glGetBufferSubData call, or by using glMapBufferRange for reading.

The limitations here are that you can generally capture only 4 output values, where each value is a vec4. There are also some strict layout restrictions. Some OpenGL 3.x and 4.x hardware offers the ability to write data to multiple feedback streams, which can all be written into different buffers.
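A sketch of what the CPU might do with a transform feedback buffer once it has been read back. It assumes a hypothetical capture of two vec4 outputs per vertex (position and normal) written in interleaved mode; the struct names and the choice of outputs are illustrative, and the GL calls are shown only in comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Vec4 { float x, y, z, w; };

// Assumed capture layout: two vec4 outputs per vertex, interleaved, as set up
// with glTransformFeedbackVaryings(program, 2, names, GL_INTERLEAVED_ATTRIBS).
struct CapturedVertex { Vec4 position; Vec4 normal; };
static_assert(sizeof(CapturedVertex) == 8 * sizeof(float),
              "layout must be tightly packed to match the GPU buffer");

// bytes: raw contents of the feedback buffer, for example filled by
//   glGetBufferSubData(GL_TRANSFORM_FEEDBACK_BUFFER, 0, size, bytes.data());
// vertex_count: vertices actually written, e.g. the result of a
//   GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query times 3 for triangles.
inline std::vector<CapturedVertex> decode_feedback(
        const std::vector<unsigned char>& bytes, std::size_t vertex_count) {
    assert(vertex_count * sizeof(CapturedVertex) <= bytes.size());
    std::vector<CapturedVertex> out(vertex_count);
    if (vertex_count != 0) {
        std::memcpy(out.data(), bytes.data(),
                    vertex_count * sizeof(CapturedVertex));
    }
    return out;
}
```

Querying the number of primitives written matters when a geometry shader is involved, because the amount of output is then not known in advance and the buffer has to be sized for the worst case.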

Image load/store, a GL 4.2 feature, represents the pinnacle of writing: you can bind an image (a buffer texture, if you want to write to a buffer), and just write to it. There are memory ordering constraints that you need to work within.
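A minimal sketch of what "just write to it" could look like with image load/store, assuming a buffer texture with an R32F format bound to image unit 0. The shader, the name `results`, the constant `RESULT_WIDTH` and the host-side handles are all illustrative assumptions, and the host calls are shown only as comments.

```cpp
#include <cassert>
#include <string>

// Fragment shader (GLSL 4.20) that stores one float per covered pixel into a
// buffer texture instead of a color attachment. RESULT_WIDTH is an assumed
// row length that must match the viewport width used for the draw.
static const char* kStoreShader = R"glsl(
#version 420 core
layout(binding = 0, r32f) uniform writeonly imageBuffer results;
const int RESULT_WIDTH = 256;
void main() {
    ivec2 p = ivec2(gl_FragCoord.xy);
    int index = p.y * RESULT_WIDTH + p.x;
    imageStore(results, index, vec4(float(index), 0.0, 0.0, 0.0));
}
)glsl";

// CPU mirror of the shader's addressing, used to find a pixel's result
// in the array that is read back.
inline int result_index(int x, int y, int result_width) {
    assert(x >= 0 && x < result_width && y >= 0);
    return y * result_width + x;
}

// Host side, with a live context (bufferTexture, size, dst are assumed):
//   glBindImageTexture(0, bufferTexture, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_R32F);
//   ... draw ...
//   glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);  // order stores before readback
//   glGetBufferSubData(GL_TEXTURE_BUFFER, 0, size, dst);
```

The glMemoryBarrier call is one instance of the ordering constraints mentioned above: shader stores are not automatically visible to later buffer reads, so the barrier has to name the kind of access that follows.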

It's very flexible, but very complex. Besides the difficulty in using it properly, there are a number of limitations. The number of images you can write to will be fairly limited, perhaps 8 or so. And implementations may have a limit on total writes, so those 8 images may have to be shared with the fragment shader's outputs.

What's more, image outputs are only guaranteed for the fragment shader (and 4.3's compute shaders). That is, hardware is allowed to forbid you from using image load/store in non-FS/CS shader stages.
