Zero-copy Camera Processing and Rendering Pipeline on Android

Problem Description

I need to do a CPU-side read-only process on live camera data (from just the Y plane) followed by rendering it on the GPU. Frames shouldn't be rendered until processing completes (so I don't always want to render the latest frame from the camera, just the latest one that the CPU-side has finished processing). Rendering is decoupled from the camera processing and aims for 60 FPS even if the camera frames arrive at a lower rate than that.

There's a related but higher-level question over at: Lowest overhead camera to CPU to GPU approach on android

To describe the current setup in a bit more detail: we have an app-side buffer pool for camera data where buffers are either "free", "in display", or "pending display". When a new frame from the camera arrives we grab a free buffer, store the frame (or a reference to it if the actual data is in some system-provided buffer pool) in there, do the processing and stash the results in the buffer, then set the buffer "pending display". In the renderer thread if there is any buffer "pending display" at the start of the render loop we latch it to be the one "in display" instead, render the camera, and render the other content using the processed information calculated from the same camera frame.
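
As a concrete illustration, the tri-state buffer pool described above could be sketched in plain Java along these lines. All the names here (FrameBuffer, BufferPool, etc.) are invented for the sketch, not from any Android API, and a fourth PROCESSING state is added to mark a buffer that has been taken from "free" but not yet published:

```java
// Illustrative sketch of the app-side buffer pool described above.
enum BufferState { FREE, PROCESSING, PENDING_DISPLAY, IN_DISPLAY }

class FrameBuffer {
    BufferState state = BufferState.FREE;
    long timestampNs;        // camera frame timestamp
    Object processedResults; // CPU-side analysis output stashed alongside the frame
}

class BufferPool {
    private final FrameBuffer[] buffers;
    private FrameBuffer pending;   // newest fully processed frame, if any
    private FrameBuffer inDisplay; // frame the renderer is currently using

    BufferPool(int size) {
        buffers = new FrameBuffer[size];
        for (int i = 0; i < size; i++) buffers[i] = new FrameBuffer();
    }

    // Camera thread: grab a free buffer for the incoming frame (null if none).
    synchronized FrameBuffer acquireFree() {
        for (FrameBuffer b : buffers) {
            if (b.state == BufferState.FREE) {
                b.state = BufferState.PROCESSING;
                return b;
            }
        }
        return null;
    }

    // Camera thread: processing finished; this becomes the newest pending frame.
    synchronized void markPendingDisplay(FrameBuffer b) {
        if (pending != null) pending.state = BufferState.FREE; // superseded, recycle
        b.state = BufferState.PENDING_DISPLAY;
        pending = b;
    }

    // Render thread, once per render loop: latch any pending frame as "in display".
    // May return the same buffer as last time, which is what lets rendering run at
    // 60 FPS even when camera frames arrive more slowly.
    synchronized FrameBuffer latchForDisplay() {
        if (pending != null) {
            if (inDisplay != null) inDisplay.state = BufferState.FREE;
            pending.state = BufferState.IN_DISPLAY;
            inDisplay = pending;
            pending = null;
        }
        return inDisplay;
    }
}
```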

Thanks to @fadden's response on the question linked above I now understand that the "parallel output" feature of the android camera2 API shares the buffers between the various output queues, so it shouldn't involve any copies of the data, at least on modern Android.

In a comment there was a suggestion that I could latch the SurfaceTexture and ImageReader outputs at the same time and just "sit on the buffer" until the processing is complete. Unfortunately I don't think that's applicable in my case due to the decoupled rendering that we still want to drive at 60 FPS, and that will still need access to the previous frame whilst the new one is being processed to ensure things don't get out of sync.

One solution that has come to mind is having multiple SurfaceTextures - one in each of our app-side buffers (we currently use 3). With that scheme when we get a new camera frame, we would obtain a free buffer from our app-side pool. Then we'd call acquireLatestImage() on an ImageReader to get the data for processing, and call updateTexImage() on the SurfaceTexture in the free buffer. At render time we just need to make sure the SurfaceTexture from the "in display" buffer is the one bound to GL, and everything should be in sync most of the time (as @fadden commented there is a race between calling updateTexImage() and acquireLatestImage(), but that time window should be small enough to make it rare, and it is perhaps detectable and fixable anyway using the timestamps in the buffers).
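
The timestamp check mentioned at the end could look something like this in plain Java - just the comparison logic, with how the two timestamps are obtained from the camera outputs deliberately left out of the sketch:

```java
// Sketch of the timestamp-based race check described above: after
// acquireLatestImage() and updateTexImage(), compare the frame timestamps
// from the two outputs to see whether the texture latched a different
// (typically newer) frame than the one being processed on the CPU.
enum FrameRelation { SAME_FRAME, TEXTURE_NEWER, IMAGE_NEWER }

class RaceCheck {
    static FrameRelation compare(long imageTimestampNs, long textureTimestampNs) {
        if (imageTimestampNs == textureTimestampNs) return FrameRelation.SAME_FRAME;
        return textureTimestampNs > imageTimestampNs
                ? FrameRelation.TEXTURE_NEWER  // texture ran ahead; re-acquire the image
                : FrameRelation.IMAGE_NEWER;   // texture hasn't latched this frame yet
    }
}
```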

I note in the docs that updateTexImage() can only be called when the SurfaceTexture is bound to a GL context, which suggests I'll need a GL context in the camera processing thread too so the camera thread can do updateTexImage() on the SurfaceTexture in the "free" buffer whilst the render thread is still able to render from the SurfaceTexture from the "in display" buffer.

So, on to the questions:

  1. Does this seem like a sensible approach?
  2. Are SurfaceTextures basically a light wrapper around the shared buffer pool, or do they consume some limited hardware resource and should be used sparingly?
  3. Are the SurfaceTexture calls cheap enough that using multiple ones is still a big win over just copying the data?
  4. Is the plan of two GL contexts in different threads, with a different SurfaceTexture bound in each, likely to work, or am I asking for a world of driver pain?

It sounds promising enough that I'm going to give it a go; but thought it worth asking here in case anyone (basically @fadden!) knows of any internal details that I've overlooked which would make this a bad idea.

Answer

Interesting question.

Background

Having multiple threads with independent contexts is very common. Every app that uses hardware-accelerated View rendering has a GLES context on the main thread, so any app that uses GLSurfaceView (or rolls their own EGL with a SurfaceView or TextureView and an independent render thread) is actively using multiple contexts.

Every TextureView has a SurfaceTexture inside it, so any app that uses multiple TextureViews has multiple SurfaceTextures on a single thread. (The framework actually had a bug in its implementation that caused problems with multiple TextureViews, but that was a high-level issue, not a driver problem.)

SurfaceTexture, a/k/a GLConsumer, doesn't do a whole lot of processing. When a frame arrives from the source (in your case, the camera), it uses some EGL functions to "wrap" the buffer as an "external" texture. You can't do these EGL operations without an EGL context to work in, which is why SurfaceTexture has to be attached to one, and why you can't put a new frame into a texture if the wrong context is current. You can see from the implementation of updateTexImage() that it's doing a lot of arcane things with buffer queues and textures and fences, but none of it requires copying pixel data. The only system resource you're really tying up is RAM, which is not inconsiderable if you're capturing high-resolution images.

Connections

An EGL context can be moved between threads, but can only be "current" on one thread at a time. Simultaneous access from multiple threads would require a lot of undesirable synchronization. A given thread has only one "current" context. The OpenGL API evolved from single-threaded with global state to multi-threaded, and rather than rewrite the API they just shoved state into thread-local storage... hence the notion of "current".
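
That per-thread "current" notion can be illustrated with a ThreadLocal in plain Java. This is a toy model of how EGL keeps per-thread state, with invented names; real code binds a context to the calling thread with eglMakeCurrent():

```java
// Toy model of EGL's per-thread "current context": each thread sees only
// its own binding, mirroring the thread-local storage described above.
// FakeContext/FakeEgl are invented names, not a real EGL API.
class FakeContext {
    final String name;
    FakeContext(String name) { this.name = name; }
}

class FakeEgl {
    private static final ThreadLocal<FakeContext> current = new ThreadLocal<>();

    /** Analogous to eglMakeCurrent(): binds ctx for the calling thread only. */
    static void makeCurrent(FakeContext ctx) { current.set(ctx); }

    static FakeContext getCurrent() { return current.get(); }
}
```

A second thread calling FakeEgl.getCurrent() would see null (or its own binding), never the first thread's context, which is exactly the property that forces the one-context-per-thread design discussed here.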

It's possible to create EGL contexts that share certain things between them, including textures, but if these contexts are on different threads you have to be very careful when the textures are updated. Grafika provides a nice example of getting it wrong.

SurfaceTextures are built on top of BufferQueues, which have a producer-consumer structure. The fun thing about SurfaceTextures is that they include both sides, so you can feed data in one side and pull it out the other within a single process (unlike, say, SurfaceView, where the consumer is far away). Like all Surface stuff, they're built on top of Binder IPC, so you can feed the Surface from one thread, and safely updateTexImage() in a different thread (or process). The API is arranged such that you create the SurfaceTexture on the consumer side (your process) and then pass a reference to the producer (e.g. camera, which primarily runs in the mediaserver process).

Implementation

You'll induce a bunch of overhead if you're constantly connecting and disconnecting BufferQueues. So if you want to have three SurfaceTextures receiving buffers, you'll need to connect all three to Camera2's output, and let all of them receive the "buffer broadcast". Then you updateTexImage() in a round-robin fashion. Since SurfaceTexture's BufferQueue runs in "async" mode, you should always get the newest frame with each call, with no need to "drain" a queue.
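
The round-robin scheme can be modeled in plain Java as below. The real consumers would be SurfaceTextures and the real latch would be updateTexImage(); this stand-in only captures the rotation and the "each call latches the newest frame" behavior of async mode:

```java
// Toy model of the round-robin scheme above: several consumers all receive
// every frame; "latching" on whichever consumer is next always picks up the
// newest frame (async mode), and the rotation then advances.
class RoundRobinLatcher {
    private final long[] latchedTimestampNs;
    private int next = 0;

    RoundRobinLatcher(int consumers) { latchedTimestampNs = new long[consumers]; }

    /** Simulates calling updateTexImage() on the next consumer in the rotation:
     *  it latches the newest available frame; returns which consumer got it. */
    int latchNewest(long newestFrameTimestampNs) {
        int i = next;
        latchedTimestampNs[i] = newestFrameTimestampNs;
        next = (next + 1) % latchedTimestampNs.length;
        return i;
    }

    long latchedAt(int consumer) { return latchedTimestampNs[consumer]; }
}
```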

This arrangement wasn't really possible until the Lollipop-era BufferQueue multi-output changes and the introduction of Camera2, so I don't know if anyone has tried this approach before.

All of the SurfaceTextures would be attached to the same EGL context, ideally in a thread other than the View UI thread, so you don't have to fight over what's current. If you want to access the texture from a second context in a different thread, you will need to use the SurfaceTexture attach/detach API calls, which explicitly support this approach:

A new OpenGL ES texture object is created and populated with the SurfaceTexture image frame that was current at the time of the last call to detachFromGLContext().
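
The contract behind those attach/detach calls can be modeled as a small state machine in plain Java - a toy model of the rules (detach from the currently attached thread before attaching elsewhere; updateTexImage() is only valid on the attached thread), not the real SurfaceTexture class:

```java
// Toy model of the SurfaceTexture attach/detach contract described above.
class TextureHandoff {
    private Thread attachedThread = null;

    // Like attachToGLContext(): only legal when currently detached.
    synchronized void attach(Thread t) {
        if (attachedThread != null)
            throw new IllegalStateException("already attached; detach first");
        attachedThread = t;
    }

    // Like detachFromGLContext(): only legal from the attached thread.
    synchronized void detach(Thread t) {
        if (attachedThread != t)
            throw new IllegalStateException("can only detach from the attached thread");
        attachedThread = null;
    }

    // updateTexImage() is only valid on the thread the texture is attached to.
    synchronized boolean canUpdateFrom(Thread t) {
        return attachedThread == t;
    }
}
```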

Remember that switching EGL contexts is a consumer-side operation, and has no bearing on the connection to the camera, which is a producer-side operation. The overhead involved in moving a SurfaceTexture between contexts should be minor -- less than updateTexImage() -- but you need to take the usual steps to ensure synchronization when communicating between threads.

It's too bad ImageReader lacks a getTimestamp() call, as that would greatly simplify matching up buffers from the camera.

Conclusion

Using multiple SurfaceTextures to buffer output is possible but tricky. I can see a potential advantage to a ping-pong buffer approach, where one ST is used to receive a frame in thread/context A while the other ST is used for rendering in thread/context B, but since you're operating in real time I don't think there's value in additional buffering unless you're trying to pad out the timing.
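
For reference, the ping-pong arrangement mentioned here is just a two-slot role swap; a minimal plain-Java sketch (slot indices standing in for the two SurfaceTextures):

```java
// Minimal model of the ping-pong arrangement: one slot (ST) receives frames
// in thread/context A while the other renders in thread/context B, and the
// roles swap once a newly received frame is ready.
class PingPong {
    private int receive = 0; // slot currently receiving

    int receiveSlot() { return receive; }
    int renderSlot()  { return receive ^ 1; }

    /** Called when the receiving slot holds a complete frame: swap roles. */
    void swap() { receive ^= 1; }
}
```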

As always, the Android System-Level Graphics Architecture document is recommended reading.
