What's wrong with my use of timestamps/timebases for frame seeking/reading using libav (ffmpeg)?
Question
So I want to grab a frame from a video at a specific time using libav for the use as a thumbnail.
What I'm using is the following code. It compiles and works fine (in regards to retrieving a picture at all), yet I'm having a hard time getting it to retrieve the right picture.
I simply can't get my head around the all but clear logic behind libav's apparent use of multiple time-bases per video. Specifically figuring out which functions expect/return which type of time-base.
The docs were of basically no help whatsoever, unfortunately. SO to the rescue?
#define ABORT(x) do {fprintf(stderr, x); exit(1);} while(0)
av_register_all();
AVFormatContext *format_context = ...;
AVCodec *codec = ...;
AVStream *stream = ...;
AVCodecContext *codec_context = ...;
int stream_index = ...;
// open codec_context, etc.
AVRational stream_time_base = stream->time_base;
AVRational codec_time_base = codec_context->time_base;
printf("stream_time_base: %d / %d = %.5f\n", stream_time_base.num, stream_time_base.den, av_q2d(stream_time_base));
printf("codec_time_base: %d / %d = %.5f\n\n", codec_time_base.num, codec_time_base.den, av_q2d(codec_time_base));
AVFrame *frame = avcodec_alloc_frame();
printf("duration: %lld @ %d/sec (%.2f sec)\n", format_context->duration, AV_TIME_BASE, (double)format_context->duration / AV_TIME_BASE);
printf("duration: %lld @ %d/sec (stream time base)\n\n", format_context->duration / AV_TIME_BASE * stream_time_base.den, stream_time_base.den);
printf("duration: %lld @ %d/sec (codec time base)\n", format_context->duration / AV_TIME_BASE * codec_time_base.den, codec_time_base.den);
double request_time = 10.0; // 10 seconds. Video's total duration is ~20sec
int64_t request_timestamp = request_time / av_q2d(stream_time_base);
printf("requested: %.2f (sec)\t-> %2lld (pts)\n", request_time, request_timestamp);
av_seek_frame(format_context, stream_index, request_timestamp, 0);
AVPacket packet;
int frame_finished;
do {
    if (av_read_frame(format_context, &packet) < 0) {
        break;
    } else if (packet.stream_index != stream_index) {
        av_free_packet(&packet);
        continue;
    }
    avcodec_decode_video2(codec_context, frame, &frame_finished, &packet);
} while (!frame_finished);
// do something with frame
int64_t received_timestamp = frame->pkt_pts;
double received_time = received_timestamp * av_q2d(stream_time_base);
printf("received: %.2f (sec)\t-> %2lld (pts)\n\n", received_time, received_timestamp);
Running this with a test movie file I get this output:
stream_time_base: 1 / 30000 = 0.00003
codec_time_base: 50 / 2997 = 0.01668
duration: 20062041 @ 1000000/sec (20.06 sec)
duration: 600000 @ 30000/sec (stream time base)
duration: 59940 @ 2997/sec (codec time base)
requested: 10.00 (sec) -> 300000 (pts)
received: 0.07 (sec) -> 2002 (pts)
The times don't match. What's going on here? What am I doing wrong?
While searching for clues I stumbled upon this statement from the libav-users mailing list…
[...] packet PTS/DTS are in units of the format context's time_base, where the AVFrame->pts value is in units of the codec context's time_base.

In other words, the container can have (and usually does) a different time_base than the codec. Most libav players don't bother using the codec's time_base or pts since not all codecs have one, but most containers do. (This is why the dranger tutorial says to ignore AVFrame->pts)
…which confused me even more, given that I couldn't find any such mention in the official docs.
Anyway, I replaced…
double received_time = received_timestamp * av_q2d(stream_time_base);
…with…
double received_time = received_timestamp * av_q2d(codec_time_base);
…and the output changed to this…
...
requested: 10.00 (sec) -> 300000 (pts)
received: 33.40 (sec) -> 2002 (pts)
Still no match. What's wrong?
Answer

It's mostly like this:
- The stream timebase is what you are really interested in. It's what the packet timestamps are in, and also pkt_pts on the output frame (since it's just copied from the corresponding packet).
- The codec timebase is (if set at all) just the inverse of the framerate that might be written in the codec-level headers. It can be useful in cases where there is no container timing information (e.g. when you're reading raw video), but otherwise can be safely ignored.
- AVFrame.pkt_pts is the timestamp of the packet that got decoded into this frame. As already said, it's just a straight copy from the packet, so it's in the stream timebase. This is the field you want to use (if the container has timestamps).
- AVFrame.pts is not ever set to anything useful when decoding; ignore it. (It might replace pkt_pts in the future, to make the whole mess less confusing, but for now it's like this, for historical reasons mostly.)
- The format context's duration is in AV_TIME_BASE units (i.e. microseconds). It cannot be in any stream timebase, since you can have three bazillion streams, each with its own timebase.
- The problem you see with getting a different timestamp after seeking is simply that seeking is not accurate. In most cases you can only seek to the closest keyframe, so it's common to be a couple of seconds off. Decoding and discarding the frames you don't need must be done manually.