How to reduce latency in MediaCodec video/avc decoding


Problem description

I performed some simple timing of MoviePlayer.java in the Grafika MediaCodec sample code running on a Nexus 5. I put a log statement at these locations:

Before line 203: decoder.queueInputBuffer

After line 244: decoder.dequeueOutputBuffer

I correlated the log statements using presentationTimeUs.

Here is an excerpt from logcat:

01-29 10:56:43.295: I/Grafika(21286): queueInputBuffer index/pts, 2,0
01-29 10:56:43.305: I/Grafika(21286): queueInputBuffer index/pts, 0,33100
01-29 10:56:43.315: I/Grafika(21286): queueInputBuffer index/pts, 3,66466
01-29 10:56:43.325: I/Grafika(21286): queueInputBuffer index/pts, 1,99833
01-29 10:56:43.325: I/Grafika(21286): queueInputBuffer index/pts, 2,133200
01-29 10:56:43.335: I/Grafika(21286): queueInputBuffer index/pts, 0,166566
01-29 10:56:43.345: I/ATSParser(21286): discontinuity on stream pid 0x1011
01-29 10:56:43.345: I/ATSParser(21286): discontinuity on stream pid 0x1100
01-29 10:56:43.345: I/Grafika(21286): queueInputBuffer index/pts, 3,199933
01-29 10:56:43.345: I/Grafika(21286): dequeueOutputBuffer index/pts, 7,0
01-29 10:56:43.345: I/Grafika(21286): queueInputBuffer index/pts, 1,300033
01-29 10:56:43.355: I/Grafika(21286): dequeueOutputBuffer index/pts, 6,33100
01-29 10:56:43.385: I/Grafika(21286): queueInputBuffer index/pts, 2,333400
01-29 10:56:43.385: I/Grafika(21286): dequeueOutputBuffer index/pts, 5,66466
01-29 10:56:43.415: I/Grafika(21286): queueInputBuffer index/pts, 0,366766
01-29 10:56:43.415: I/Grafika(21286): dequeueOutputBuffer index/pts, 4,99833
01-29 10:56:43.445: I/Grafika(21286): queueInputBuffer index/pts, 3,400133
01-29 10:56:43.445: I/Grafika(21286): dequeueOutputBuffer index/pts, 3,133200

I found that the time from queuing the first input buffer to dequeuing the corresponding output buffer is 50 ms. This seems like a lot of time for hardware-accelerated decoding.

Is there any way to reduce this latency?

Recommended answer

I think you're seeing some effects unique to the first frame. I repeated your experiment, and additionally forced doRender = false around line 244 to avoid the sleep calls used to manage the output frame rate. I see:

01-29 14:05:36.552  9115  9224 I Grafika : queueInputBuffer index/pts, 2,0
01-29 14:05:36.562  9115  9224 I Grafika : queueInputBuffer index/pts, 0,66655
01-29 14:05:36.572  9115  9224 I Grafika : queueInputBuffer index/pts, 3,133288
01-29 14:05:36.582  9115  9224 I Grafika : queueInputBuffer index/pts, 1,199955

01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 4,0
01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 3,66655
01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 2,133288
01-29 14:05:36.612  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 4,199955

(Extraneous lines removed for clarity.) This confirms your results. Note that, while there was a 50 ms lag between input and output for pts=0, the subsequent output frames were available almost instantly. The video I used was "camera-test.mp4" (720p camera output).
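The pairing of log lines by pts can also be done mechanically. Below is a minimal sketch, assuming the log format shown above; the class name LogLatency and its method are made up for illustration and are not part of Grafika. It ignores day rollover at midnight.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pairs queueInputBuffer / dequeueOutputBuffer log lines by pts.
class LogLatency {
    // Captures HH:mm:ss.SSS, the call name, the buffer index and the pts.
    private static final Pattern LINE = Pattern.compile(
        "(\\d{2}):(\\d{2}):(\\d{2})\\.(\\d{3}).*?"
        + "(queueInputBuffer|dequeueOutputBuffer) index/pts, (\\d+),(\\d+)");

    /** Returns pts -> latency in milliseconds, in the order frames were dequeued. */
    static Map<Long, Long> latencies(String[] lines) {
        Map<Long, Long> queuedAtMs = new LinkedHashMap<>();
        Map<Long, Long> result = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // unrelated line, e.g. ATSParser or OMX-VDEC output
            }
            long ms = ((Long.parseLong(m.group(1)) * 60 + Long.parseLong(m.group(2))) * 60
                    + Long.parseLong(m.group(3))) * 1000 + Long.parseLong(m.group(4));
            long pts = Long.parseLong(m.group(7));
            if (m.group(5).equals("queueInputBuffer")) {
                queuedAtMs.put(pts, ms);
            } else if (queuedAtMs.containsKey(pts)) {
                result.put(pts, ms - queuedAtMs.get(pts));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] excerpt = {
            "01-29 14:05:36.552  9115  9224 I Grafika : queueInputBuffer index/pts, 2,0",
            "01-29 14:05:36.562  9115  9224 I Grafika : queueInputBuffer index/pts, 0,66655",
            "01-29 14:05:36.572  9115  9224 I Grafika : queueInputBuffer index/pts, 3,133288",
            "01-29 14:05:36.582  9115  9224 I Grafika : queueInputBuffer index/pts, 1,199955",
            "01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 4,0",
            "01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 3,66655",
            "01-29 14:05:36.602  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 2,133288",
            "01-29 14:05:36.612  9115  9224 I Grafika : dequeueOutputBuffer index/pts, 4,199955",
        };
        latencies(excerpt).forEach((pts, ms) ->
            System.out.println("pts=" + pts + " latency=" + ms + " ms"));
    }
}
```

On the excerpt above this reports 50 ms for pts=0, then 40, 30 and 30 ms for the following frames. All four outputs arrive within about 10 ms of each other, so the cost is concentrated before the first output frame.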

For insight into why this is happening, take a look at the other stuff in the log, and where it appears. Starting from the first queueInputBuffer log line, count the number of log lines that appear between that and the first dequeueOutputBuffer line. I count about 60 lines of output from OMX-VDEC-1080P on mine. Now count how many OMX-VDEC lines appear after output buffers start appearing. I see none until the video ends.

The video decoder is clearly deferring some expensive initialization until data is available. So the next question is... how much data does it need? I added a 500 ms sleep after submitting the second frame (pts==66633). The result: two frames submitted, 500 ms pause, two frames submitted, big pile of OMX-VDEC logs. So it seems that the decoder wants several frames before it will start.

This suggests that we can reduce the start-up latency by feeding the first few frames in quickly. To test that, I changed TIMEOUT_USEC to zero, so it'll respond quickly but burn CPU. New log output (with your log statements, no sleep, no rendering):

01-29 14:29:04.542 10560 10599 I Grafika : queueInputBuffer index/pts, 0,0
01-29 14:29:04.542 10560 10599 I Grafika : queueInputBuffer index/pts, 2,66633
01-29 14:29:04.542 10560 10599 I Grafika : queueInputBuffer index/pts, 3,133288
...
01-29 14:29:04.572 10560 10599 I Grafika : dequeueOutputBuffer index/pts, 4,0
01-29 14:29:04.572 10560 10599 I Grafika : dequeueOutputBuffer index/pts, 3,66633
01-29 14:29:04.572 10560 10599 I Grafika : dequeueOutputBuffer index/pts, 2,133288

By feeding the initial frames in quickly, we've reduced the initial latency from 50 ms to 30 ms.

(Notice how all the timestamps end in '2'? The timer used for the log times appears to be rounding to the nearest 10 ms, so the actual time delta may be slightly different.)
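One way around the coarse log timestamps is to measure inside the process with System.nanoTime(). The following is only a sketch: FrameTimer and its methods are hypothetical names, and it assumes each pts is unique within the stream so it can serve as the key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical helper: measures queue-to-dequeue latency per frame, keyed by pts.
class FrameTimer {
    private final LongSupplier nanoClock;
    private final Map<Long, Long> queuedNanos = new HashMap<>();

    /** Uses System.nanoTime(), which is monotonic and not tied to the log clock. */
    FrameTimer() {
        this(System::nanoTime);
    }

    /** Accepts any nanosecond clock, which makes the class easy to test. */
    FrameTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Call just before queueInputBuffer() with the frame's presentation time. */
    void onQueued(long ptsUs) {
        queuedNanos.put(ptsUs, nanoClock.getAsLong());
    }

    /**
     * Call just after dequeueOutputBuffer() returns a frame.
     * Returns the latency in microseconds, or -1 if this pts was never queued.
     */
    long onDequeued(long ptsUs) {
        Long start = queuedNanos.remove(ptsUs);
        return start == null ? -1 : (nanoClock.getAsLong() - start) / 1_000;
    }
}
```

In the decode loop, onQueued(presentationTimeUs) would go next to the first log statement and onDequeued(info.presentationTimeUs) next to the second, with the returned value logged in place of comparing logcat times.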

The reason we're feeding the initial frames slowly is that we're trying to drain output from the decoder after each input buffer is submitted, waiting 10 ms for output that never appears. My initial thought is that we want to wait for a timeout on either dequeueInputBuffer() or dequeueOutputBuffer(), but not both -- maybe use a timeout on input and a quick poll for output at first, then switch to a timeout on output when we run out of input to feed. (For that matter, the initial timeout for input could be -1, since we know nothing is going to happen until the first input buffer is queued.)
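The timeout idea can be written down as a small policy. This is an untested sketch of the suggestion, not code from Grafika: TimeoutPolicy and its method names are invented for illustration, and WAIT_USEC is assumed to match the 10 ms wait mentioned above. It relies on MediaCodec treating a timeout of 0 as "return immediately" and a negative timeout as "wait indefinitely".

```java
// Hypothetical helper: picks the timeouts to pass to MediaCodec's dequeue calls.
final class TimeoutPolicy {
    static final long POLL_USEC = 0;        // return immediately
    static final long INFINITE_USEC = -1;   // block until a buffer is available
    static final long WAIT_USEC = 10_000;   // assumed 10 ms wait, as in the text

    private TimeoutPolicy() {
    }

    /**
     * Timeout for dequeueInputBuffer() while there is still input to feed.
     * Before the first buffer is queued nothing else can happen, so block.
     */
    static long inputTimeoutUs(boolean firstBufferQueued) {
        return firstBufferQueued ? WAIT_USEC : INFINITE_USEC;
    }

    /**
     * Timeout for dequeueOutputBuffer(): poll while input is still being fed,
     * and only wait once the input has run out.
     */
    static long outputTimeoutUs(boolean inputDone) {
        return inputDone ? WAIT_USEC : POLL_USEC;
    }

    public static void main(String[] args) {
        System.out.println("start:      in=" + inputTimeoutUs(false)
                + " out=" + outputTimeoutUs(false));
        System.out.println("feeding:    in=" + inputTimeoutUs(true)
                + " out=" + outputTimeoutUs(false));
        System.out.println("input done: out=" + outputTimeoutUs(true));
    }
}
```

In the decode loop these values would replace the single TIMEOUT_USEC constant, so that at any moment only one of the two dequeue calls waits. Whether this brings the start-up latency below the 30 ms measured above would have to be checked on the device.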

I don't know if there is a way to reduce the latency further.
