How do you synchronize a Metal Performance Shader with an MTLBlitCommandEncoder?

Question

I'm trying to better understand the synchronization requirements when working with Metal Performance Shaders and an MTLBlitCommandEncoder.

I have an MTLCommandBuffer that is set up as follows:

  • Use MTLBlitCommandEncoder to copy a region of Texture A into Texture B. Texture A is larger than Texture B. I'm extracting a "tile" from Texture A and copying it into Texture B.

  • Use an MPSImageBilinearScale Metal Performance Shader with Texture B as the source texture and a third texture, Texture C, as the destination. This shader will scale and potentially translate the contents of Texture B into Texture C.

How do I ensure that the blit encoder completely finishes copying the data from Texture A to Texture B before the metal performance shader starts trying to scale Texture B? Do I even have to worry about this or does the serial nature of a command buffer take care of this for me already?

Metal has the concept of fences, using MTLFence, for synchronizing access to resources, but I don't see any way to have a Metal Performance Shader wait on a fence. (Whereas waitForFence: is present on the encoders.)

If I can't use fences and I do need to synchronize, is the recommended practice to just enqueue the blit encoder, call waitUntilCompleted on the command buffer, then enqueue the shader and call waitUntilCompleted a second time? For example:

id<MTLCommandBuffer> commandBuffer;

// Enqueue blit encoder to copy Texture A -> Texture B
id<MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder copyFromTexture:...];
[blitEncoder endEncoding];

// Wait for blit encoder to complete.
[commandBuffer commit];
[commandBuffer waitUntilCompleted];

// Scale Texture B -> Texture C
MPSImageBilinearScale *imageScaleShader = [[MPSImageBilinearScale alloc] initWithDevice:...];  
[imageScaleShader encodeToCommandBuffer:commandBuffer...];

// Wait for scaling shader to complete.
[commandBuffer commit];
[commandBuffer waitUntilCompleted];

The reason I think I need to do the intermediary copy into Texture B is because MPSImageBilinearScale appears to scale its entire source texture. The clipOffset is useful for output, but it doesn't apply to the actual scaling or transform. So the tile needs to be extracted from Texture A into Texture B that is the same size as the tile itself. Then the scaling and transform will "make sense". Disregard this footnote because I had forgotten some basic math principles and have since figured out how to make the scale transform's translate properties work with the clipRect.

Recommended Answer

Metal takes care of this for you. The driver and GPU execute commands in a command buffer as though in serial fashion. (The "as though" allows for running things in parallel or out of order for efficiency, but only if the result would be the same as when done serially.)
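
To make that concrete, here is a minimal sketch of both passes encoded into a single command buffer with no explicit synchronization between them. The helper name EncodeTileCopyAndScale and its parameters are hypothetical; it assumes Texture B is exactly the size of the tile, that Texture C was created with shader-write usage, and that the default scale-to-fit behaviour of MPSImageBilinearScale (no scaleTransform set) is what you want.

```objective-c
#import <Metal/Metal.h>
#import <MetalPerformanceShaders/MetalPerformanceShaders.h>

// Hypothetical helper: copies a tile of textureA into textureB, then scales
// textureB into textureC, all within one command buffer.
// Assumes textureB has the tile's dimensions and textureC allows shader writes.
static id<MTLCommandBuffer> EncodeTileCopyAndScale(id<MTLCommandQueue> queue,
                                                   id<MTLTexture> textureA,
                                                   id<MTLTexture> textureB,
                                                   id<MTLTexture> textureC,
                                                   MTLOrigin tileOrigin)
{
    id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];

    // Pass 1: blit the tile out of Texture A into Texture B.
    id<MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
    [blitEncoder copyFromTexture:textureA
                     sourceSlice:0
                     sourceLevel:0
                    sourceOrigin:tileOrigin
                      sourceSize:MTLSizeMake(textureB.width, textureB.height, 1)
                       toTexture:textureB
                destinationSlice:0
                destinationLevel:0
               destinationOrigin:MTLOriginMake(0, 0, 0)];
    [blitEncoder endEncoding];

    // Pass 2: the MPS kernel encodes its own work after the blit pass.
    // No fence or intermediate commit is added between the two passes.
    MPSImageBilinearScale *imageScaleShader =
        [[MPSImageBilinearScale alloc] initWithDevice:queue.device];
    [imageScaleShader encodeToCommandBuffer:commandBuffer
                              sourceTexture:textureB
                         destinationTexture:textureC];

    // Commit exactly once, after everything has been encoded.
    [commandBuffer commit];
    return commandBuffer;
}
```

Note that the command buffer is committed once; nothing is encoded into it after the commit.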

Synchronization issues arise when both the CPU and GPU are working with the same objects. Also with presenting textures on-screen. (You shouldn't be rendering to a texture that's being presented on screen.)
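
As an illustration of the CPU/GPU case, the sketch below reads the scaled result back on the CPU only after the GPU has finished with it, using a completion handler. The helper name ReadBackWhenDone and the onPixels callback are hypothetical. It assumes the command buffer has not been committed yet, that Texture C uses a 4-byte-per-pixel format such as BGRA8Unorm, and that its storage mode is CPU-accessible (a private texture cannot be read with getBytes, and a managed texture on macOS would first need a synchronizeResource: blit).

```objective-c
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: registers a handler that copies textureC's pixels to
// CPU memory once the GPU has finished, then commits the command buffer.
// Assumes commandBuffer is not yet committed and textureC is CPU-readable.
static void ReadBackWhenDone(id<MTLCommandBuffer> commandBuffer,
                             id<MTLTexture> textureC,
                             void (^onPixels)(NSData *pixels))
{
    // Handlers must be added before the command buffer is committed.
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> finished) {
        if (finished.status != MTLCommandBufferStatusCompleted) {
            NSLog(@"Command buffer failed: %@", finished.error);
            return;
        }
        // Assumption: 4 bytes per pixel (for example BGRA8Unorm).
        NSUInteger bytesPerRow = textureC.width * 4;
        NSMutableData *pixels =
            [NSMutableData dataWithLength:bytesPerRow * textureC.height];
        [textureC getBytes:pixels.mutableBytes
               bytesPerRow:bytesPerRow
                fromRegion:MTLRegionMake2D(0, 0, textureC.width, textureC.height)
               mipmapLevel:0];
        onPixels(pixels);
    }];
    [commandBuffer commit];
}
```

Calling waitUntilCompleted after the commit would give the same guarantee, at the cost of blocking the calling thread.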

There's a section of the Metal Programming Guide which deals with read-write access to resources by shaders, which is not exactly the same, but should reassure you:

Memory Barriers

Between Command Encoders

All resource writes performed in a given command encoder are visible in the next command encoder. This is true for both render and compute command encoders.

Within a Render Command Encoder

For buffers, atomic writes are visible to subsequent atomic reads across multiple threads.

For textures, the textureBarrier method ensures that writes performed in a given draw call are visible to subsequent reads in the next draw call.

Within a Compute Command Encoder

All resource writes performed in a given kernel function are visible in the next kernel function.
