Read dumped RTP stream in libav

Hi, I am in need of a bit of help/guidance because I've gotten stuck in my research.

The problem:

How can I convert RTP data using either GStreamer or avlib (ffmpeg), through either the API (programmatically) or the console tools?

Data

I have an RTP dump that comes from RTP/RTCP over TCP, so I can get the precise start and stop of each RTP packet in the file. It's an H.264 video stream dump. The data is in this form because I need to acquire the RTCP/RTP interleaved stream via libcurl (which I'm currently doing).

Status

I've tried to use ffmpeg to consume pure RTP packets, but it seems that using RTP, either from the console or programmatically, involves "starting" the whole RTSP/RTP session business in ffmpeg. I've stopped there and, for the time being, haven't pursued this avenue any deeper. I guess this is possible with a lower-level RTP API like ff_rtp_parse_packet(), but I'm too new to this lib to do it straight away.

Then there is GStreamer. It has somewhat more capability to do this without programming, but for the time being I haven't been able to figure out how to feed it the RTP dump I have.

I have also tried a bit of trickery: streaming the dump via socat/nc to a UDP port and listening on it via ffplay with an SDP file as input. There seems to be some progress, the RTP at least gets recognized, but with socat there are loads of missing packets (data sent too fast, perhaps?) and in the end the data is not visualized. When I used nc the video was badly misshapen, but at least there were not that many receive errors.
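
For reference, the SDP file for such a replay test could look roughly like this (a sketch; the address, port 5004 and payload type 96 are assumptions and have to match whatever socat/nc is sending to):

v=0
o=- 0 0 IN IP4 127.0.0.1
s=RTP dump replay
c=IN IP4 127.0.0.1
t=0 0
m=video 5004 RTP/AVP 96
a=rtpmap:96 H264/90000

It can then be played with ffplay -i replay.sdp (newer ffmpeg builds additionally need -protocol_whitelist file,udp,rtp).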

One way or another the data is not properly visualized.

I know I can depacketize the data "by hand", but the idea is to do it via some kind of library, because in the end there will also be a second stream with audio that will have to be muxed together with the video.

I would appreciate any help on how to tackle this problem. Thanks.

Solution

Finally, after some time, I was able to sit down with this problem again, and I've got a solution that satisfies me. I went on with the RTP interleaved stream (RTP interleaved with RTCP over a single TCP connection).
So I had an interleaved RTCP/RTP stream that needed to be disassembled into audio (PCM A-law) and video (H.264 constrained baseline) RTP packets.
The decomposition of an RTSP stream carrying interleaved RTP data is described in RFC 2326, section 10.12; see the framing sketch below.
Depacketization of the H.264 payload is described in RFC 6184; the PCM A-law frames came out as raw audio in RTP, so no depacketization was necessary for them.
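
Each interleaved frame in that scheme is a '$' byte, one channel-id byte, a 16-bit big-endian length, and then the raw RTP or RTCP packet; that is what this sketch parses (the function and callback names are illustrative, not from my actual code):

#include <stdint.h>
#include <stddef.h>

/* Consume one interleaved frame (RFC 2326, section 10.12) from buf.
   Returns the number of bytes eaten, or 0 if more data is needed.
   Channel ids are the ones negotiated in the RTSP SETUP (typically
   even = RTP, odd = RTCP). */
size_t parse_interleaved(const uint8_t *buf, size_t len,
                         void (*on_packet)(uint8_t ch, const uint8_t *p, uint16_t n))
{
    if (len < 4 || buf[0] != '$')
        return 0;                               /* out of sync or header incomplete */
    uint8_t  channel = buf[1];
    uint16_t size    = (uint16_t)((buf[2] << 8) | buf[3]);
    if (len < (size_t)4 + size)
        return 0;                               /* payload not complete yet */
    on_packet(channel, buf + 4, size);
    return (size_t)4 + size;
}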

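For the H.264 side, the packetization modes that actually showed up here were single NAL units and FU-A fragments; a simplified RFC 6184 depacketizer sketch (bounds checks omitted, names illustrative):

#include <stdint.h>
#include <string.h>

static const uint8_t start_code[4] = { 0, 0, 0, 1 };

/* Append the Annex-B form of one RTP payload to out and return the
   new length of out. STAP-A/MTAP aggregation packets are not handled. */
size_t h264_depacketize(const uint8_t *payload, size_t len,
                        uint8_t *out, size_t out_len)
{
    uint8_t nal_type = payload[0] & 0x1F;

    if (nal_type >= 1 && nal_type <= 23) {      /* single NAL unit packet */
        memcpy(out + out_len, start_code, 4);
        memcpy(out + out_len + 4, payload, len);
        return out_len + 4 + len;
    }
    if (nal_type == 28) {                       /* FU-A fragment */
        uint8_t fu_header = payload[1];
        if (fu_header & 0x80) {                 /* start bit: rebuild the NAL header */
            memcpy(out + out_len, start_code, 4);
            out[out_len + 4] = (uint8_t)((payload[0] & 0xE0) | (fu_header & 0x1F));
            out_len += 5;
        }
        memcpy(out + out_len, payload + 2, len - 2);
        return out_len + len - 2;
    }
    return out_len;
}
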
The next step was to calculate a proper PTS (presentation time stamp) for each stream. That was a bit of a hassle, but finally the Live555 code came to help (see RTP lipsync synchronization); the sketch below shows the idea.
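
The core of it is mapping each RTP timestamp onto the wall clock carried by the RTCP sender reports; this is the kind of value that ends up in the fSyncTime fields used further down (the struct and names here are illustrative, not my actual code):

#include <stdint.h>
#include <sys/time.h>

struct rtcp_sync {
    uint32_t       sr_rtp_ts;   /* RTP timestamp of the last sender report   */
    struct timeval sr_wall;     /* NTP wall clock of that report, as timeval */
    uint32_t       clock_rate;  /* 90000 for video, 8000 for A-law audio     */
};

/* presentation time = SR wall clock + (rtp_ts - SR rtp_ts) / clock_rate */
static struct timeval rtp_to_presentation(const struct rtcp_sync *s, uint32_t rtp_ts)
{
    int32_t diff = (int32_t)(rtp_ts - s->sr_rtp_ts);    /* wrap-safe difference */
    int64_t usec = s->sr_wall.tv_sec * 1000000LL + s->sr_wall.tv_usec
                 + (int64_t)diff * 1000000LL / s->clock_rate;

    struct timeval tv;
    tv.tv_sec  = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    return tv;
}
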
The last task was to mux it into a container that supports PCM A-law; I used ffmpeg's av-libraries.
There are many examples on the Internet, but many of them are outdated (ffmpeg is very 'dynamic' when it comes to API changes), so I'm posting (the most important parts of) what actually worked for me in the end:

The setup part:

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include "libavutil/intreadwrite.h"
#include "libavutil/mathematics.h"

AVFormatContext   *formatContext;
AVOutputFormat    *outputFormat;
AVStream          *video_st;
AVStream          *audio_st;
AVCodec           *av_encode_codec = NULL;
AVCodec           *av_audio_encode_codec = NULL;
AVCodecContext    *av_video_encode_codec_ctx = NULL;
AVCodecContext    *av_audio_encode_codec_ctx = NULL;


/* guess the container from the output file name and force the codec ids */
av_register_all();
av_log_set_level(AV_LOG_TRACE);
outputFormat = av_guess_format(NULL, pu8outFileName, NULL);
outputFormat->video_codec = AV_CODEC_ID_H264;

av_encode_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
av_audio_encode_codec = avcodec_find_encoder(AV_CODEC_ID_PCM_ALAW);
avformat_alloc_output_context2(&formatContext, NULL, NULL, pu8outFileName);
formatContext->oformat = outputFormat;
strcpy(formatContext->filename, pu8outFileName);
outputFormat->audio_codec  = AV_CODEC_ID_PCM_ALAW;

/* the contexts only describe the already-encoded data; no encoder is
   ever opened, the depacketized frames are muxed as-is */
av_video_encode_codec_ctx = avcodec_alloc_context3(av_encode_codec);
av_audio_encode_codec_ctx = avcodec_alloc_context3(av_audio_encode_codec);

av_video_encode_codec_ctx->codec_id = outputFormat->video_codec;
av_video_encode_codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
av_video_encode_codec_ctx->bit_rate = 4000;
av_video_encode_codec_ctx->width  = u32width;
av_video_encode_codec_ctx->height = u32height;
av_video_encode_codec_ctx->time_base = (AVRational){ 1, u8fps };
av_video_encode_codec_ctx->max_b_frames = 0;
av_video_encode_codec_ctx->pix_fmt = AV_PIX_FMT_YUV420P;

/* PCM A-law as delivered in the RTP stream: 8 kHz, mono */
av_audio_encode_codec_ctx->sample_fmt = AV_SAMPLE_FMT_S16;
av_audio_encode_codec_ctx->codec_id = AV_CODEC_ID_PCM_ALAW;
av_audio_encode_codec_ctx->codec_type = AVMEDIA_TYPE_AUDIO;
av_audio_encode_codec_ctx->sample_rate = 8000;
av_audio_encode_codec_ctx->channels = 1;
av_audio_encode_codec_ctx->time_base = (AVRational){ 1, u8fps };
av_audio_encode_codec_ctx->channel_layout = AV_CH_LAYOUT_MONO;

/* one stream per substream; 90000 is the RTP video clock rate */
video_st = avformat_new_stream(formatContext, av_encode_codec);
audio_st = avformat_new_stream(formatContext, av_audio_encode_codec);
audio_st->index = 1;
video_st->avg_frame_rate = (AVRational){ 90000, 90000 / u8fps };
av_stream_set_r_frame_rate(video_st, (AVRational){ 90000, 90000 / u8fps });
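
Not shown above is the glue that attaches the stream parameters and opens the output before any packet is written; roughly it looks like this (a sketch assuming a libav version with codecpar; older versions copied the settings into video_st->codec instead):

avcodec_parameters_from_context(video_st->codecpar, av_video_encode_codec_ctx);
avcodec_parameters_from_context(audio_st->codecpar, av_audio_encode_codec_ctx);

if (!(outputFormat->flags & AVFMT_NOFILE))
    avio_open(&formatContext->pb, pu8outFileName, AVIO_FLAG_WRITE);
avformat_write_header(formatContext, NULL);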

The packets for video are written like this:

uint8_t  *pu8framePtr = video_frame;
AVPacket pkt = { 0 };
av_init_packet(&pkt);
/* byte 4 is the NAL unit header right after the 4-byte Annex-B start
   code: 0x65 = IDR slice, 0x67 = SPS, 0x68 = PPS -> mark as key frame */
if (0x65 == pu8framePtr[4] || 0x67 == pu8framePtr[4] || 0x68 == pu8framePtr[4])
{
    pkt.flags = AV_PKT_FLAG_KEY;
}

pkt.data = (uint8_t *)pu8framePtr;
pkt.size = u32LastFrameSize;

/* RTCP-derived wall clock in microseconds, rescaled to the stream time base */
pkt.pts  = av_rescale_q(s_video_sync.fSyncTime.tv_sec * 1000000 + s_video_sync.fSyncTime.tv_usec, (AVRational){ 1, 1000000 }, video_st->time_base);
pkt.dts  = pkt.pts;
pkt.stream_index = video_st->index;
av_interleaved_write_frame(formatContext, &pkt);
av_packet_unref(&pkt);

and for the audio like this:

AVPacket pkt = { 0 };
av_init_packet(&pkt);
pkt.flags = AV_PKT_FLAG_KEY;    /* every raw A-law packet is a sync point */
pkt.data = (uint8_t *)pu8framePtr;
pkt.size = u32AudioDataLen;

pkt.pts  = av_rescale_q(s_audio_sync.fSyncTime.tv_sec * 1000000 + s_audio_sync.fSyncTime.tv_usec, (AVRational){ 1, 1000000 }, audio_st->time_base);
pkt.dts  = pkt.pts;
pkt.stream_index = audio_st->index;
/* drop audio until the first IDR frame has been written so the file
   starts with a decodable video frame */
if (u8FirstIFrameFound) {av_interleaved_write_frame(formatContext, &pkt);}
av_packet_unref(&pkt);

and at the end some deinits:

/* finalize the file, print the resulting format info, free everything */
av_write_trailer(formatContext);
av_dump_format(formatContext, 0, pu8outFileName, 1);
avcodec_free_context(&av_video_encode_codec_ctx);
avcodec_free_context(&av_audio_encode_codec_ctx);
avio_closep(&formatContext->pb);
avformat_free_context(formatContext);
