What's wrong with my use of timestamps/timebases for frame seeking/reading using libav (ffmpeg)?


Problem description


So I want to grab a frame from a video at a specific time using libav for the use as a thumbnail.

What I'm using is the following code. It compiles and works fine (in regards to retrieving a picture at all), yet I'm having a hard time getting it to retrieve the right picture.

I simply can't get my head around the anything-but-clear logic behind libav's apparent use of multiple time-bases per video. Specifically, I can't figure out which functions expect or return which type of time-base.

The docs were of basically no help whatsoever, unfortunately. SO to the rescue?

#define ABORT(x) do {fprintf(stderr, x); exit(1);} while(0)

av_register_all();

AVFormatContext *format_context = ...;
AVCodec *codec = ...;
AVStream *stream = ...;
AVCodecContext *codec_context = ...;
int stream_index = ...;

// open codec_context, etc.

AVRational stream_time_base = stream->time_base;
AVRational codec_time_base = codec_context->time_base;

printf("stream_time_base: %d / %d = %.5f\n", stream_time_base.num, stream_time_base.den, av_q2d(stream_time_base));
printf("codec_time_base: %d / %d = %.5f\n\n", codec_time_base.num, codec_time_base.den, av_q2d(codec_time_base));

AVFrame *frame = avcodec_alloc_frame();

printf("duration: %lld @ %d/sec (%.2f sec)\n", format_context->duration, AV_TIME_BASE, (double)format_context->duration / AV_TIME_BASE);
printf("duration: %lld @ %d/sec (stream time base)\n\n", format_context->duration / AV_TIME_BASE * stream_time_base.den, stream_time_base.den);
printf("duration: %lld @ %d/sec (codec time base)\n", format_context->duration / AV_TIME_BASE * codec_time_base.den, codec_time_base.den);

double request_time = 10.0; // 10 seconds. Video's total duration is ~20sec
int64_t request_timestamp = request_time / av_q2d(stream_time_base);
printf("requested: %.2f (sec)\t-> %2lld (pts)\n", request_time, request_timestamp);

av_seek_frame(format_context, stream_index, request_timestamp, 0);

AVPacket packet;
int frame_finished;
do {
    if (av_read_frame(format_context, &packet) < 0) {
        break;
    } else if (packet.stream_index != stream_index) {
        av_free_packet(&packet);
        continue;
    }
    avcodec_decode_video2(codec_context, frame, &frame_finished, &packet);
} while (!frame_finished);

// do something with frame

int64_t received_timestamp = frame->pkt_pts;
double received_time = received_timestamp * av_q2d(stream_time_base);
printf("received:  %.2f (sec)\t-> %2lld (pts)\n\n", received_time, received_timestamp);

Running this with a test movie file I get this output:

    stream_time_base: 1 / 30000 = 0.00003
    codec_time_base: 50 / 2997 = 0.01668

    duration: 20062041 @ 1000000/sec (20.06 sec)
    duration: 600000 @ 30000/sec (stream time base)
    duration: 59940 @ 2997/sec (codec time base)

    requested: 10.00 (sec)  -> 300000 (pts)
    received:  0.07 (sec)   -> 2002 (pts)

The times don't match. What's going on here? What am I doing wrong?


While searching for clues I stumbled upon this statement from the libav-users mailing list…

[...] packet PTS/DTS are in units of the format context's time_base,
where the AVFrame->pts value is in units of the codec context's time_base.

In other words, the container can have (and usually does) a different time_base than the codec. Most libav players don't bother using the codec's time_base or pts since not all codecs have one, but most containers do. (This is why the dranger tutorial says to ignore AVFrame->pts)

…which confused me even more, given that I couldn't find any such mention in the official docs.

Anyway, I replaced…

double received_time = received_timestamp * av_q2d(stream_time_base);

…with…

double received_time = received_timestamp * av_q2d(codec_time_base);

…and the output changed to this…

...

requested: 10.00 (sec)  -> 300000 (pts)
received:  33.40 (sec)  -> 2002 (pts)

Still no match. What's wrong?

Solution

It's mostly like this:

  • the stream timebase is what you are really interested in. It's what the packet timestamps are in, and also pkt_pts on the output frame (since it's just copied from the corresponding packet).

  • the codec timebase is (if set at all) just the inverse of the framerate that might be written in the codec-level headers. It can be useful in cases where there is no container timing information (e.g. when you're reading raw video), but otherwise can be safely ignored.

  • AVFrame.pkt_pts is the timestamp of the packet that got decoded into this frame. As already said, it's just a straight copy from the packet, so it's in the stream timebase. This is the field you want to use (if the container has timestamps).

  • AVFrame.pts is never set to anything useful when decoding, so ignore it (it might replace pkt_pts in the future, to make the whole mess less confusing, but for now it's like this, mostly for historical reasons).

  • the format context's duration is in AV_TIME_BASE (i.e. microseconds). It cannot be in any stream timebase, since you can have three bazillion streams, each with its own timebase.

  • the problem you see with getting a different timestamp after seeking is simply that seeking is not accurate. In most cases you can only seek to the closest keyframe, so it's common to be a couple of seconds off. Decoding and discarding the frames you don't need must be done manually.
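
To make the points above concrete, here is a minimal sketch of the two pieces of arithmetic involved: converting a timestamp between time bases, and stepping forward from a keyframe to the frame you actually asked for. It deliberately does not use libav itself. `Rational`, `ToyFrame`, `rescale_ts`, `seek_keyframe_before` and `first_frame_at_or_after` are hypothetical names made up for illustration. In real code you would most likely use libav's own `av_rescale_q()` (with `AV_TIME_BASE_Q` for microseconds) for the conversion, pass `AVSEEK_FLAG_BACKWARD` to `av_seek_frame()` so you land on a keyframe before the target, and run the discard loop over actually decoded frames.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libav's AVRational, so this sketch builds
 * without the libav headers. A time base of num/den means one tick
 * lasts num/den seconds. */
typedef struct { int num; int den; } Rational;

/* Convert ts from time base `from` to time base `to`, rounding to nearest.
 * Assumes ts >= 0, positive time bases, and no int64_t overflow. */
int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;
    return (ts * num + den / 2) / den;
}

/* Toy model of a stream (hypothetical): timestamps in the stream time
 * base, sorted in presentation order, plus a keyframe flag. */
typedef struct { int64_t pts; int is_key; } ToyFrame;

/* Models a backward seek: index of the last keyframe with pts <= target,
 * or 0 (start of stream) if there is none. */
size_t seek_keyframe_before(const ToyFrame *f, size_t n, int64_t target)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++)
        if (f[i].is_key && f[i].pts <= target)
            best = i;
    return best;
}

/* Decode-and-discard: walk forward from the keyframe until the first
 * frame with pts >= target. Returns n if the stream ends first. */
size_t first_frame_at_or_after(const ToyFrame *f, size_t n,
                               size_t start, int64_t target)
{
    size_t i = start;
    while (i < n && f[i].pts < target)
        i++; /* real code would decode this frame and throw it away */
    return i;
}
```

Plugging in the numbers from the question, with microseconds as `{1, 1000000}`, the stream time base as `{1, 30000}` and the codec time base as `{50, 2997}`: `rescale_ts(10000000, us, stream)` gives 300000, matching the requested pts. `rescale_ts(20062041, us, stream)` gives 601861 rather than the 600000 printed in the question, because the question's code divides by `AV_TIME_BASE` in integer arithmetic first and so truncates to whole seconds. The received pts of 2002 is 66733 µs (0.07 sec) in the stream time base, and 33400067 µs (33.40 sec) when misread in the codec time base, which is where the second output came from.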
