FFmpeg: read a frame, process it, write it to the output video; copy the sound stream unchanged


Problem description

I want to apply processing to a video clip with a sound track: extract and process it frame by frame and write the result to an output file. The number of frames, the frame size and the speed remain unchanged in the output clip. I also want to keep the same audio track as in the source.

I can read the clip, decode frames and process them using OpenCV. Audio packets are also written fine. I'm stuck on forming the output video stream.

The minimal runnable code I have for now (sorry it's not so short, but I can't make it shorter):

extern "C" {
#include <libavutil/timestamp.h>
#include <libavformat/avformat.h>
#include "libavcodec/avcodec.h"
#include <libavutil/opt.h>
#include <libavdevice/avdevice.h>
#include <libswscale/swscale.h>
}
#include "opencv2/opencv.hpp"

#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(55,28,1)
#define av_frame_alloc  avcodec_alloc_frame
#endif

using namespace std;
using namespace cv;

static void log_packet(const AVFormatContext *fmt_ctx, const AVPacket *pkt, const char *tag)
{
    AVRational *time_base = &fmt_ctx->streams[pkt->stream_index]->time_base;

    // Note: each call must write into its own buffer (the original wrote
    // everything into buf1, so only the last value survived).
    char buf1[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_string(buf1, pkt->pts);
    char buf2[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_string(buf2, pkt->dts);
    char buf3[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_string(buf3, pkt->duration);

    char buf4[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_time_string(buf4, pkt->pts, time_base);
    char buf5[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_time_string(buf5, pkt->dts, time_base);
    char buf6[AV_TS_MAX_STRING_SIZE] = { 0 };
    av_ts_make_time_string(buf6, pkt->duration, time_base);

    printf("pts:%s pts_time:%s dts:%s dts_time:%s duration:%s duration_time:%s stream_index:%d\n",
        buf1, buf4,
        buf2, buf5,
        buf3, buf6,
        pkt->stream_index);

}


int main(int argc, char **argv)
{
    AVOutputFormat *ofmt = NULL;
    AVFormatContext *ifmt_ctx = NULL, *ofmt_ctx = NULL;
    AVPacket pkt;
    AVFrame *pFrame = NULL;
    AVFrame *pFrameRGB = NULL;
    int frameFinished = 0;
    pFrame = av_frame_alloc();
    pFrameRGB = av_frame_alloc();

    const char *in_filename, *out_filename;
    int ret, i;
    in_filename = "../../TestClips/Audio Video Sync Test.mp4";
    out_filename = "out.mp4";

    // Initialize FFMPEG
    av_register_all();
    // Get input file format context
    if ((ret = avformat_open_input(&ifmt_ctx, in_filename, 0, 0)) < 0)
    {
        fprintf(stderr, "Could not open input file '%s'", in_filename);
        goto end;
    }
    // Extract streams description
    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0)
    {
        fprintf(stderr, "Failed to retrieve input stream information");
        goto end;
    }
    // Print detailed information about the input or output format,
    // such as duration, bitrate, streams, container, programs, metadata, side data, codec and time base.
    av_dump_format(ifmt_ctx, 0, in_filename, 0);

    // Allocate an AVFormatContext for an output format.
    avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, out_filename);
    if (!ofmt_ctx)
    {
        fprintf(stderr, "Could not create output context\n");
        ret = AVERROR_UNKNOWN;
        goto end;
    }

    // The output container format.
    ofmt = ofmt_ctx->oformat;

    // Allocating output streams
    for (i = 0; i < ifmt_ctx->nb_streams; i++)
    {
        AVStream *in_stream = ifmt_ctx->streams[i];
        AVStream *out_stream = avformat_new_stream(ofmt_ctx, in_stream->codec->codec);
        if (!out_stream)
        {
            fprintf(stderr, "Failed allocating output stream\n");
            ret = AVERROR_UNKNOWN;
            goto end;
        }
        ret = avcodec_copy_context(out_stream->codec, in_stream->codec);
        if (ret < 0)
        {
            fprintf(stderr, "Failed to copy context from input to output stream codec context\n");
            goto end;
        }
        out_stream->codec->codec_tag = 0;
        if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
        {
            out_stream->codec->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
        }
    }

    // Show output format info
    av_dump_format(ofmt_ctx, 0, out_filename, 1);

    // Open output file
    if (!(ofmt->flags & AVFMT_NOFILE))
    {
        ret = avio_open(&ofmt_ctx->pb, out_filename, AVIO_FLAG_WRITE);
        if (ret < 0)
        {
            fprintf(stderr, "Could not open output file '%s'", out_filename);
            goto end;
        }
    }
    // Write output file header
    ret = avformat_write_header(ofmt_ctx, NULL);
    if (ret < 0)
    {
        fprintf(stderr, "Error occurred when opening output file\n");
        goto end;
    }

    // Search for input video codec info
    AVCodec *in_codec = nullptr;
    AVCodecContext* avctx = nullptr;

    int video_stream_index = -1;
    for (int i = 0; i < ifmt_ctx->nb_streams; i++)
    {
        if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) // codec_type, not coder_type
        {
            video_stream_index = i;
            avctx = ifmt_ctx->streams[i]->codec;
            in_codec = avcodec_find_decoder(avctx->codec_id);
            if (!in_codec)
            {
                fprintf(stderr, "in codec not found\n");
                exit(1);
            }
            break;
        }
    }

    // Search for output video codec info
    AVCodec *out_codec = nullptr;
    AVCodecContext* o_avctx = nullptr;

    int o_video_stream_index = -1;
    for (int i = 0; i < ofmt_ctx->nb_streams; i++)
    {
        if (ofmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) // codec_type, not coder_type
        {
            o_video_stream_index = i;
            o_avctx = ofmt_ctx->streams[i]->codec;
            out_codec = avcodec_find_encoder(o_avctx->codec_id);
            if (!out_codec)
            {
                fprintf(stderr, "out codec not found\n");
                exit(1);
            }
            break;
        }
    }

    // openCV pixel format
    AVPixelFormat pFormat = AV_PIX_FMT_RGB24;
    // Data size
    int numBytes = avpicture_get_size(pFormat, avctx->width, avctx->height);
    // allocate buffer 
    uint8_t *buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));
    // fill frame structure
    avpicture_fill((AVPicture *)pFrameRGB, buffer, pFormat, avctx->width, avctx->height);
    // frame area
    int y_size = avctx->width * avctx->height;
    // Open input codec
    avcodec_open2(avctx, in_codec, NULL);
    // Main loop
    while (1)
    {
        AVStream *in_stream, *out_stream;
        ret = av_read_frame(ifmt_ctx, &pkt);
        if (ret < 0)
        {
            break;
        }
        in_stream = ifmt_ctx->streams[pkt.stream_index];
        out_stream = ofmt_ctx->streams[pkt.stream_index];
        log_packet(ifmt_ctx, &pkt, "in");
        // copy packet 
        pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base, AVRounding(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
        pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base, AVRounding(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
        pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
        pkt.pos = -1;

        log_packet(ofmt_ctx, &pkt, "out");
        if (pkt.stream_index == video_stream_index)
        {
            avcodec_decode_video2(avctx, pFrame, &frameFinished, &pkt);
            if (frameFinished)
            {
                struct SwsContext *img_convert_ctx;
                img_convert_ctx = sws_getCachedContext(NULL,
                    avctx->width,
                    avctx->height,
                    avctx->pix_fmt,
                    avctx->width,
                    avctx->height,
                    AV_PIX_FMT_BGR24,
                    SWS_BICUBIC,
                    NULL,
                    NULL,
                    NULL);
                sws_scale(img_convert_ctx,
                    ((AVPicture*)pFrame)->data,
                    ((AVPicture*)pFrame)->linesize,
                    0,
                    avctx->height,
                    ((AVPicture *)pFrameRGB)->data,
                    ((AVPicture *)pFrameRGB)->linesize);

                sws_freeContext(img_convert_ctx);

                // Do some image processing
                // Wrap the RGB buffer without copying; pass the stride explicitly
                cv::Mat img(avctx->height, avctx->width, CV_8UC3, pFrameRGB->data[0], pFrameRGB->linesize[0]);
                cv::GaussianBlur(img,img,Size(5,5),3);
                cv::imshow("Display", img);
                cv::waitKey(5);
                // --------------------------------
                // Transform back to initial format
                // --------------------------------
                img_convert_ctx = sws_getCachedContext(NULL,
                    avctx->width,
                    avctx->height,
                    AV_PIX_FMT_BGR24,
                    avctx->width,
                    avctx->height,
                    avctx->pix_fmt,
                    SWS_BICUBIC,
                    NULL,
                    NULL,
                    NULL);
                sws_scale(img_convert_ctx,
                    ((AVPicture*)pFrameRGB)->data,
                    ((AVPicture*)pFrameRGB)->linesize,
                    0,
                    avctx->height,
                    ((AVPicture *)pFrame)->data,
                    ((AVPicture *)pFrame)->linesize);
                    // --------------------------------------------
                    // Something must be here
                    // --------------------------------------------
                    //
                    // Write video frame (How to write the frame to the output stream?)
                    //
                    // --------------------------------------------
                     sws_freeContext(img_convert_ctx);
            }

        }
        else // write sound frame
        {
            ret = av_interleaved_write_frame(ofmt_ctx, &pkt);
        }
        if (ret < 0)
        {
            fprintf(stderr, "Error muxing packet\n");
            break;
        }
        // Decrease packet ref counter
        av_packet_unref(&pkt);
    }
    av_write_trailer(ofmt_ctx);
end:
    avformat_close_input(&ifmt_ctx);
    // close output 
    if (ofmt_ctx && !(ofmt->flags & AVFMT_NOFILE))
    {
        avio_closep(&ofmt_ctx->pb);
    }
    avformat_free_context(ofmt_ctx);
    if (ret < 0 && ret != AVERROR_EOF)
    {
        char buf_err[AV_ERROR_MAX_STRING_SIZE] = { 0 };
        av_make_error_string(buf_err, AV_ERROR_MAX_STRING_SIZE, ret);
        fprintf(stderr, "Error occurred: %s\n", buf_err);
        return 1;
    }

    avcodec_close(avctx);
    av_free(pFrame);
    av_free(pFrameRGB);

    return 0;
}


Recommended answer

Your original code segfaults in my case. Initializing the output codec context seems to fix it. The code below works for me, but I didn't test the OpenCV part since I don't have the library installed.

Getting the codec context:

// Search for output video codec info
AVCodec *out_codec = NULL;
AVCodecContext* o_avctx = NULL;

int o_video_stream_index = -1;

for (int i = 0; i < ofmt_ctx->nb_streams; i++)
{
    if (ofmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) // codec_type, not coder_type
    {
        o_video_stream_index = i;        
        out_codec = avcodec_find_encoder(ofmt_ctx->streams[i]->codec->codec_id);
        o_avctx = avcodec_alloc_context3(out_codec);

        o_avctx->height = avctx->height;
        o_avctx->width = avctx->width;
        o_avctx->sample_aspect_ratio = avctx->sample_aspect_ratio;            
        if (out_codec->pix_fmts)
            o_avctx->pix_fmt = out_codec->pix_fmts[0];
        else
            o_avctx->pix_fmt = avctx->pix_fmt;
        o_avctx->time_base = avctx->time_base;

        avcodec_open2(o_avctx, out_codec, NULL);
    }
}

Encoding and writing:

// Main loop
while (1)
{

...

if (pkt.stream_index == video_stream_index)
{        
    avcodec_decode_video2(avctx, pFrame, &frameFinished, &pkt);

    if (frameFinished)
    {
        ...
        // --------------------------------------------
        // Something must be here
        // --------------------------------------------
        int got_packet = 0;
        AVPacket enc_pkt = { 0 };
        av_init_packet(&enc_pkt);

        avcodec_encode_video2(o_avctx, &enc_pkt, pFrame, &got_packet);
        // The encoder may buffer frames before emitting a packet,
        // so only mux when a packet was actually produced.
        if (got_packet)
            av_interleaved_write_frame(ofmt_ctx, &enc_pkt);

        ....

    }
}
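The snippet above still leaves two details to the reader: the encoded packet's timestamps should be moved from the encoder's time base into the output stream's time base before muxing (`avformat_write_header` may change the stream time base), and the encoder has to be drained at end of stream. A hedged sketch of both, using the same deprecated `avcodec_encode_video2` API generation as the rest of the post; it assumes the `o_avctx`, `ofmt_ctx` and `o_video_stream_index` variables from the code above and is untested against a real build:

```cpp
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}

// Encode one frame (or flush with frame == NULL), fix timestamps, and mux.
// Returns 1 if a packet was written, 0 if the encoder produced nothing,
// negative on error. Error handling is kept minimal for clarity.
static int encode_and_write(AVCodecContext *o_avctx, AVFormatContext *ofmt_ctx,
                            int o_video_stream_index, AVFrame *frame)
{
    AVPacket enc_pkt = { 0 };
    av_init_packet(&enc_pkt);
    int got_packet = 0;
    int ret = avcodec_encode_video2(o_avctx, &enc_pkt, frame, &got_packet);
    if (ret < 0)
        return ret;
    if (!got_packet)
        return 0; // frame buffered, or encoder fully drained
    enc_pkt.stream_index = o_video_stream_index;
    // Encoder timestamps are in o_avctx->time_base; the muxer expects the
    // output stream's time_base.
    av_packet_rescale_ts(&enc_pkt, o_avctx->time_base,
                         ofmt_ctx->streams[o_video_stream_index]->time_base);
    ret = av_interleaved_write_frame(ofmt_ctx, &enc_pkt);
    return ret < 0 ? ret : 1;
}

// After the read loop, drain delayed packets before av_write_trailer():
//     while (encode_and_write(o_avctx, ofmt_ctx, o_video_stream_index, NULL) > 0)
//         ;
```

Before calling this for a decoded frame, `pFrame->pts` should also be set (for example from `av_frame_get_best_effort_timestamp`) and rescaled into `o_avctx->time_base`, otherwise the encoder sees no timestamps at all.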

