MP4 Atom Parsing - where to configure time...?

Question

I've written an MP4 parser that can read atoms in an MP4 just fine and stitch them back together. The result is a technically valid MP4 file that QuickTime can open, but it can't play any audio, as I believe the timing/sampling information is all off. I should probably mention I'm only interested in audio.

What I'm doing is trying to take the moov atoms/etc from an existing MP4, and then take only a subset of the mdat atom in the file to create a new, smaller MP4. In doing so I've altered the duration in the mvhd atom, as well as the duration in the mdia header. There are no tkhd atoms in this file that have edits, so I believe I don't need to alter the durations there - what am I missing?

In creating the new MP4 I'm properly sectioning the mdat block with a wide box, and keeping the 'mdat' header/size in their right places - I make sure to update the size with the new content.

Now it's entirely 110% possible I'm missing something crucial about the format, but if this is possible I'd love to get the final piece. Anybody got any input/ideas?

Code can be found at the following link:

https://gist.github.com/ryanmcgrath/958c602cff133bd7fa0b

Recommended answer

I'm going to take a stab in the dark here and say that you're not updating your stbl offsets properly. At least I didn't (at first glance) see your Python doing that anywhere.

Let's start with the location of data. Packets are written into the file in terms of chunks, and the header tells the decoder where each "block" of these chunks exists. The stsc table says how many samples exist per chunk, and each entry's first chunk field says at which chunk that run starts. It's a little confusing, so take an example: an stsc table with the entries (first chunk 1, 100 samples per chunk) and (first chunk 8, 98 samples per chunk) is saying that you have 100 samples per chunk up to the 8th chunk, and at the 8th chunk there are 98 samples.
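To make the run-length idea concrete, here is a minimal sketch that expands stsc entries into one sample count per chunk. The function name and the tuple layout are made up for illustration; real stsc entries also carry a sample description index, which is left out here, and the chunk count is assumed to come from the stco table.

```python
def samples_per_chunk(stsc_entries, chunk_count):
    """Expand run-length stsc entries into one sample count per chunk.

    stsc_entries: list of (first_chunk, samples_per_chunk) pairs, with
    first_chunk 1-based as it is stored in the file.
    chunk_count: total number of chunks (the stco entry count).
    """
    counts = []
    for i, (first_chunk, per_chunk) in enumerate(stsc_entries):
        # A run lasts until the next entry's first_chunk, or the last chunk.
        if i + 1 < len(stsc_entries):
            next_first = stsc_entries[i + 1][0]
        else:
            next_first = chunk_count + 1
        counts.extend([per_chunk] * (next_first - first_chunk))
    return counts


# The example from the text: 100 samples per chunk, then 98 in chunk 8.
counts = samples_per_chunk([(1, 100), (8, 98)], chunk_count=8)
print(counts)
```

With the counts expanded like this, it is easier to work out which chunks (and therefore which samples) survive after you cut data out of mdat.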

That said, you also have to track where the offsets of these chunks are. That's the job of the stco table. So, where in the file is chunk offset 1, or chunk offset 2, etc.

If you modify any data in mdat you have to maintain these tables. You can't just chop mdat data out, and expect the decoder to know what to do.
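As a rough sketch of what maintaining stco could look like, the snippet below rebases the offsets of a kept range of chunks and serializes a 32-bit stco box. It assumes the kept chunks are contiguous in the original file and are copied into the new mdat as one byte range; rebase_chunk_offsets, build_stco and the sample numbers are all hypothetical. Keep in mind that if moov comes before mdat, new_data_start depends on the final size of moov, so it has to be computed after the tables have been rebuilt.

```python
import struct


def rebase_chunk_offsets(old_offsets, first_kept, last_kept, new_data_start):
    """Return stco offsets for chunks first_kept..last_kept (0-based, inclusive).

    new_data_start is the absolute file position where the first kept
    chunk will sit in the new file (just past the new mdat header).
    """
    kept = old_offsets[first_kept:last_kept + 1]
    shift = new_data_start - kept[0]
    return [off + shift for off in kept]


def build_stco(offsets):
    """Serialize an stco box: size, type, version/flags, count, offsets."""
    payload = struct.pack(">II", 0, len(offsets))  # version+flags = 0, entry count
    payload += struct.pack(">%dI" % len(offsets), *offsets)
    return struct.pack(">I4s", 8 + len(payload), b"stco") + payload


# Made-up numbers: four chunks in the old file, keep the middle two.
old_offsets = [48, 4048, 8048, 12048]
new_offsets = rebase_chunk_offsets(old_offsets, 1, 2, new_data_start=40)
stco_box = build_stco(new_offsets)
print(new_offsets, len(stco_box))
```

Files larger than 4 GiB use co64 with 64-bit offsets instead of stco; that case is not covered by this sketch.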

As if this wasn't enough, you now also have to maintain the sample time table (stts), the sample size table (stsz), and, if this were video, the sync sample table (stss).

stts says how long a sample should play for in units of the timescale. If you're doing audio the timescale is probably 44100 or 48000 (the sample rate in Hz).

If you've lopped off some data, now everything could potentially be out of sync. If all the values here have the exact same duration though you'd be OK.
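If the durations are not all identical, the stts entries have to be trimmed to match the samples you kept. A minimal sketch, assuming the entries have already been parsed into (sample_count, sample_delta) pairs and that you keep one contiguous run of samples; trim_stts is a made-up helper name:

```python
def trim_stts(entries, first_sample, sample_count):
    """Keep sample_count samples starting at 0-based first_sample.

    entries: list of (sample_count, sample_delta) run-length pairs.
    Returns the run-length pairs covering only the kept samples.
    """
    out = []
    skip, need = first_sample, sample_count
    for count, delta in entries:
        if need == 0:
            break
        if skip >= count:
            # This whole run lies before the kept range.
            skip -= count
            continue
        take = min(count - skip, need)
        skip = 0
        out.append((take, delta))
        need -= take
    return out


# Made-up table: three samples of 1024 ticks, then two of 512 ticks.
trimmed = trim_stts([(3, 1024), (2, 512)], first_sample=2, sample_count=2)
print(trimmed)
```

When every sample has the same delta, the result collapses to a single entry with the new sample count, which is the "you'd be OK" case described above, apart from updating that count.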

stsz says what size each sample is in bytes. This is important for the decoder to be able to start at a chunk, and then go through each sample by its size.
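The walk described above can be sketched roughly like this. It assumes you already have the chunk offsets from stco, one sample count per chunk (expanded from stsc), and the per-sample sizes from stsz as plain Python lists; the function name and the numbers are illustrative only.

```python
def sample_offsets(chunk_offsets, per_chunk_counts, sample_sizes):
    """Compute the absolute file offset of every sample.

    Samples inside a chunk are stored back to back, so each sample
    starts where the previous one in the same chunk ended.
    """
    offsets = []
    sample = 0
    for chunk_off, n in zip(chunk_offsets, per_chunk_counts):
        pos = chunk_off
        for size in sample_sizes[sample:sample + n]:
            offsets.append(pos)
            pos += size
        sample += n
    return offsets


# Made-up layout: two chunks, holding two samples and one sample.
offsets = sample_offsets([100, 1000], [2, 1], [10, 20, 30])
print(offsets)
```

Going the other way, knowing each sample's offset and size is what lets you decide which byte range of mdat to keep and which stsz entries belong to it.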

Again, if all the sample sizes are exactly the same you'd be OK. Audio tends to be pretty much the same, but video stuff varies a lot (with keyframes and whatnot).

And last but not least we have the stss table, which says which frames are keyframes. I only have experience with AAC, but every audio frame is considered a keyframe. In that case you can have one entry that describes all the packets.

In relation to your original question, the time display isn't always honored the same way in each player. The most accurate way is to sum up the durations of all the frames in the header and use that as the total time. Other players use the metadata in the track headers. I've found it best to just keep all the values the same and then players are happy.
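One way to keep all the values the same is to derive every duration from the stts table. The sketch below sums the sample durations in media timescale units (the value for the mdia header) and rescales that sum to the movie timescale used by mvhd and tkhd. The helper names are hypothetical, and the timescales 44100 and 600 and the 430 frames of 1024 ticks are assumed example values, not taken from the file in question.

```python
def media_duration_ticks(stts_entries):
    """Sum all sample durations, in media timescale units."""
    return sum(count * delta for count, delta in stts_entries)


def to_movie_timescale(media_ticks, media_timescale, movie_timescale):
    """Rescale a media duration to movie timescale units, rounding to nearest."""
    return (media_ticks * movie_timescale + media_timescale // 2) // media_timescale


stts = [(430, 1024)]  # assumed: 430 audio frames of 1024 ticks each
mdhd_duration = media_duration_ticks(stts)
mvhd_duration = to_movie_timescale(mdhd_duration, 44100, 600)
print(mdhd_duration, mvhd_duration)
```

Writing the first value into the mdia header and the second into both mvhd and tkhd keeps the header metadata in agreement with what a player would get by summing the frames itself.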

If you're doing all that and I missed it in the script then can you post a sample mp4 and a standalone app and I can try to help you out.
