Problem decoding H264 video over RTP with ffmpeg (libavcodec)


Question


I set profile_idc, level_idc, extradata and extradata_size of AVCodecContext from the profile-level-id and sprop-parameter-sets of the SDP.

I decode Coded Slice, SPS, PPS and NAL_IDR_SLICE packets separately:

Init:

uint8_t start_sequence[] = {0, 0, 1};
int size = recv(id_de_la_socket, (char*) rtpReceive, 65535, 0);

Coded Slice :

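// Prefix the NAL unit with a start code; 16 is the number of header bytes skipped here.
char *z = new char[size - 16 + sizeof(start_sequence)];
memcpy(z, start_sequence, sizeof(start_sequence));
memcpy(z + sizeof(start_sequence), rtpReceive + 16, size - 16);
ConsumedBytes = avcodec_decode_video(codecContext, pFrame, &GotPicture, (uint8_t*)z, size - 16 + sizeof(start_sequence));
delete[] z;  // allocated with new[], so it must be released with delete[]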

Result: ConsumedBytes >0 and GotPicture >0 (often)

SPS and PPS :

Identical code. Result: ConsumedBytes >0 and GotPicture =0

That is normal, I think.

When I find a new SPS/PPS pair, I update extradata and extradata_size with the payloads of these packets and their size.

NAL_IDR_SLICE :

The NAL unit type is 28 => IDR frames are fragmented, therefore I tried two methods to decode them:

1) I prefix the first fragment (without RTP header) with the sequence 0x000001 and send it to avcodec_decode_video. Then I send the rest of the fragments to this function.

2) I prefix the first fragment (without RTP header) with the sequence 0x000001 and concatenate the rest of the fragments to it. I send this buffer to the decoder.

In both cases, I get no error (ConsumedBytes >0) but I detect no frame (GotPicture = 0) ...

What is the problem?

Solution

In RTP, H264 I-frames (IDRs) are usually fragmented. When you receive an RTP packet you must first skip the header (usually the first 12 bytes) and then get to the NAL unit (first payload byte). If the NAL type is 28 (0x1C), the payload that follows is one fragment of an H264 IDR (I-frame), and you need to collect all of the fragments to reconstruct it. (Type 28 is the FU-A fragmentation unit of the RTP H264 payload format; it can carry other large NAL units as well, not only IDR slices.)
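The "usually 12 bytes" above matters because the question skips a fixed 16 bytes. A minimal sketch of computing the payload offset from the packet itself, assuming the standard RTP fixed header layout (CSRC count in the low 4 bits of the first byte, extension flag in bit 0x10). The function name rtp_payload_offset is hypothetical, not part of any library:

```cpp
#include <cstddef>
#include <cstdint>

// Returns the offset of the first payload byte, or 0 if the packet is too
// short. Assumed layout: 12 fixed bytes, then CC * 4 bytes of CSRC
// identifiers, then an optional header extension.
std::size_t rtp_payload_offset(const std::uint8_t* pkt, std::size_t len) {
    if (len < 12) return 0;
    const std::size_t csrc_count = pkt[0] & 0x0F;     // CC field
    const bool has_extension = (pkt[0] & 0x10) != 0;  // X bit
    std::size_t offset = 12 + 4 * csrc_count;
    if (has_extension) {
        if (len < offset + 4) return 0;
        // Extension header: 16-bit profile-defined id, then a 16-bit
        // length counted in 32-bit words (not including these 4 bytes).
        const std::size_t ext_words =
            (static_cast<std::size_t>(pkt[offset + 2]) << 8) | pkt[offset + 3];
        offset += 4 + 4 * ext_words;
    }
    return offset < len ? offset : 0;
}
```

With an offset computed this way, the NAL type check becomes (pkt[offset] & 0x1F) == 28 regardless of whether the sender adds CSRC entries or a header extension.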

Fragmentation occurs because the MTU is limited and an IDR is much larger. A fragment can look like this:

Fragment that has START BIT = 1:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS ]
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS ]
Other bytes: [ ... IDR FRAGMENT DATA ... ]

Other fragments (START BIT = 0):

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS ]
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS ]
Other bytes: [ ... IDR FRAGMENT DATA ... ]

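To reconstruct the IDR you must collect this info (Data points at the first payload byte):

int fragment_type = Data[0] & 0x1F;  // 28 means this payload is a fragment
int nal_type = Data[1] & 0x1F;       // type of the NAL unit being fragmented
int start_bit = Data[1] & 0x80;      // non-zero on the first fragment
int end_bit = Data[1] & 0x40;        // non-zero on the last fragment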

If fragment_type == 28, the payload that follows is one fragment of the IDR. Next, check whether start_bit is set; if it is, that fragment is the first one in the sequence. Use it to reconstruct the IDR's NAL byte: take the first 3 bits of the first payload byte (3 NAL UNIT BITS) and combine them with the last 5 bits of the second payload byte (5 NAL UNIT BITS), so you get a byte like this: [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Write that NAL byte first into a clear buffer, followed by all the remaining bytes of that fragment. Remember to skip the first two payload bytes of every fragment, since they are not part of the IDR and only describe the fragment.

If start_bit and end_bit are both 0, just append the payload (skipping the first two payload bytes, which only describe the fragment) to the buffer.

If start_bit is 0 and end_bit is 1, it is the last fragment: append its payload (again skipping the first two bytes) to the buffer, and now you have your IDR reconstructed.
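The reassembly steps can be sketched roughly as follows. This assumes the FU-A layout from the RTP H264 payload format (RFC 6184), in which every fragment begins with two bytes, so two bytes are skipped per fragment. It also prepends the 0x000001 start code used in the question. The class FuaReassembler is a hypothetical helper, not an FFmpeg API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: collects FU-A fragments into one NAL unit
// prefixed with a 0x000001 start code.
class FuaReassembler {
public:
    // payload points at the first byte after the RTP header.
    // Returns true once the end fragment has been appended.
    bool push(const std::uint8_t* payload, std::size_t len) {
        if (len < 3 || (payload[0] & 0x1F) != 28) return false;  // not FU-A
        const std::uint8_t fu_header = payload[1];
        const bool start = (fu_header & 0x80) != 0;
        const bool end = (fu_header & 0x40) != 0;
        if (start) {
            buf_ = {0, 0, 1};  // start code
            // Rebuilt NAL byte: top 3 bits of byte 0, low 5 bits of byte 1.
            buf_.push_back(static_cast<std::uint8_t>(
                (payload[0] & 0xE0) | (fu_header & 0x1F)));
            in_progress_ = true;
        } else if (!in_progress_) {
            return false;  // start fragment was never seen; drop this one
        }
        // Append the fragment data, skipping the two describing bytes.
        buf_.insert(buf_.end(), payload + 2, payload + len);
        if (end) {
            in_progress_ = false;
            return true;
        }
        return false;
    }

    const std::vector<std::uint8_t>& nal() const { return buf_; }

private:
    std::vector<std::uint8_t> buf_;
    bool in_progress_ = false;
};
```

When push returns true, nal() holds the complete unit and can be handed to the decoder in a single call. The sketch does not check RTP sequence numbers, so a lost middle fragment would go unnoticed; a real receiver would probably want to discard the buffer in that case.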

If you need some code, just ask in a comment and I'll post it, but I think it is pretty clear how to do this... =)

CONCERNING THE DECODING

It crossed my mind today why decoding the IDR might fail for you (I presume that you have reconstructed it correctly). How are you building your AVC Decoder Configuration Record? Does the lib that you use do that automatically? If not, and you haven't heard of this, continue reading...

The AVCDCR is specified to allow decoders to quickly parse all the data they need to decode an H264 (AVC) video stream. The data is the following:

  • ProfileIDC
  • ProfileIOP
  • LevelIDC
  • SPS (Sequence Parameter Sets)
  • PPS (Picture Parameter Sets)

All this data is sent in the SDP of the RTSP session, under the fields profile-level-id and sprop-parameter-sets.

DECODING PROFILE-LEVEL-ID

The profile-level-id string is divided into 3 substrings, each 2 characters long:

[PROFILE IDC][PROFILE IOP][LEVEL IDC]

Each substring represents one byte in base16! So, if Profile IDC is 28, that means it is actually 40 in base10. Later you will use the resulting byte values to construct the AVC Decoder Configuration Record.
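As an illustration of the hex split, here is a small sketch that turns a six-character profile-level-id into its three bytes. The struct and function names are made up for this example:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical container for the three decoded bytes.
struct ProfileLevelId {
    std::uint8_t profile_idc;
    std::uint8_t profile_iop;
    std::uint8_t level_idc;
};

// Value of one hex digit, or -1 if the character is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses e.g. "42e01f" into {0x42, 0xE0, 0x1F}. Returns false on bad input.
bool parse_profile_level_id(const std::string& s, ProfileLevelId& out) {
    if (s.size() != 6) return false;
    std::uint8_t bytes[3];
    for (std::size_t i = 0; i < 3; ++i) {
        const int hi = hex_value(s[2 * i]);
        const int lo = hex_value(s[2 * i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes[i] = static_cast<std::uint8_t>(hi * 16 + lo);
    }
    out.profile_idc = bytes[0];
    out.profile_iop = bytes[1];
    out.level_idc = bytes[2];
    return true;
}
```

For the example in the text, parsing "280000" gives a profile_idc of 40.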

DECODING SPROP-PARAMETER-SETS

Sprops are usually 2 strings (there could be more) that are comma separated and base64 encoded! You could parse the contents of both of them, but there is no need to. Your job here is just to convert each one from a base64 string into a byte array for later use. Now you have 2 byte arrays: the first array is the SPS, the second one is the PPS.
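A sketch of that conversion, assuming the standard base64 alphabet with optional '=' padding. Both function names are hypothetical, and the decoder is deliberately minimal (it ignores characters outside the alphabet instead of reporting them):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Value of one base64 character, or -1 if it is not in the alphabet.
static int b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes one base64 string into raw bytes, stopping at '=' padding.
std::vector<std::uint8_t> base64_decode(const std::string& in) {
    std::vector<std::uint8_t> out;
    std::uint32_t acc = 0;  // bit accumulator; only the low bits are used
    int bits = 0;           // number of pending bits in acc
    for (char c : in) {
        if (c == '=') break;
        const int v = b64_value(c);
        if (v < 0) continue;
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}

// Splits sprop-parameter-sets on commas and decodes each part.
// With the usual two entries, index 0 is the SPS and index 1 the PPS.
std::vector<std::vector<std::uint8_t>> parse_sprop_parameter_sets(
        const std::string& sprop) {
    std::vector<std::vector<std::uint8_t>> sets;
    std::size_t begin = 0;
    while (begin < sprop.size()) {
        std::size_t comma = sprop.find(',', begin);
        if (comma == std::string::npos) comma = sprop.size();
        if (comma > begin) {
            sets.push_back(base64_decode(sprop.substr(begin, comma - begin)));
        }
        begin = comma + 1;
    }
    return sets;
}
```

A quick sanity check on the result: the first byte of a decoded SPS normally has NAL type 7 in its low 5 bits, and the first byte of a PPS has type 8.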

BUILDING THE AVCDCR

Now you have all you need to build the AVCDCR. Start with a new, clean buffer and write the following into it, in this order:

1 - Byte with value 1, representing the version

2 - Profile IDC byte

3 - Profile IOP byte

4 - Level IDC byte

5 - Byte with value 0xFF (six reserved bits set to 1, plus two bits saying that NAL unit lengths are stored in 4 bytes)

6 - Byte with value 0xE1 (three reserved bits set to 1, plus a 5-bit SPS count of 1)

7 - Short (16 bits, big-endian) with the length of the SPS array

8 - SPS byte array

9 - Byte with the number of PPS arrays (you could have more of them in sprop-parameter-sets)

10 - Short (16 bits, big-endian) with the length of the following PPS array

11 - PPS array
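The eleven steps translate almost line for line into code. A minimal sketch, assuming a single SPS and writing the 16-bit lengths in big-endian order; build_avcdcr and put_u16_be are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends a 16-bit value in big-endian (network) byte order.
static void put_u16_be(Bytes& out, std::size_t v) {
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Builds the record from the decoded profile-level-id bytes, one SPS
// and one or more PPS arrays.
Bytes build_avcdcr(std::uint8_t profile_idc, std::uint8_t profile_iop,
                   std::uint8_t level_idc, const Bytes& sps,
                   const std::vector<Bytes>& pps_list) {
    Bytes out;
    out.push_back(1);            // 1: version
    out.push_back(profile_idc);  // 2
    out.push_back(profile_iop);  // 3
    out.push_back(level_idc);    // 4
    out.push_back(0xFF);         // 5: reserved bits + 4-byte NAL lengths
    out.push_back(0xE1);         // 6: reserved bits + SPS count of 1
    put_u16_be(out, sps.size());                     // 7
    out.insert(out.end(), sps.begin(), sps.end());   // 8
    out.push_back(static_cast<std::uint8_t>(pps_list.size()));  // 9
    for (const Bytes& pps : pps_list) {
        put_u16_be(out, pps.size());                    // 10
        out.insert(out.end(), pps.begin(), pps.end());  // 11
    }
    return out;
}
```

One thing to keep in mind: byte 5 declares 4-byte NAL unit lengths, so a decoder configured with this record will typically expect each NAL unit to be prefixed with a 4-byte big-endian length instead of the 0x000001 start code.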

DECODING VIDEO STREAM

Now you have a byte array that tells the decoder how to decode the H264 video stream. I believe that you need this if your lib doesn't build it itself from the SDP...
