Fragmented MP4 - problem playing in browser

Problem Description

I'm trying to create a fragmented MP4 from raw H264 video data so that I can play it in an internet browser's player. My goal is to create a live streaming system, where a media server would send fragmented MP4 pieces to the browser. The server would buffer input data from a Raspberry Pi camera, which sends video as H264 frames. It would then mux that video data and make it available to the client. The browser would play the media data (muxed by the server and sent e.g. over a websocket) using Media Source Extensions.

For test purposes I wrote the following pieces of code (drawing on many examples I found on the internet):

A C++ application using avcodec, which muxes the raw H264 video into fragmented MP4 and saves it to a file:

#define READBUFSIZE 4096
#define IOBUFSIZE 4096
#define ERRMSGSIZE 128

#include <cstdint>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

extern "C"
{
    #include <libavformat/avformat.h>
    #include <libavutil/error.h>
    #include <libavutil/opt.h>
}

enum NalType : uint8_t
{
    //NALs containing stream metadata
    SEQ_PARAM_SET = 0x7,
    PIC_PARAM_SET = 0x8
};

std::vector<uint8_t> outputData;

int mediaMuxCallback(void *opaque, uint8_t *buf, int bufSize)
{
    outputData.insert(outputData.end(), buf, buf + bufSize);
    return bufSize;
}

std::string getAvErrorString(int errNr)
{
    char errMsg[ERRMSGSIZE];
    av_strerror(errNr, errMsg, ERRMSGSIZE);
    return std::string(errMsg);
}

int main(int argc, char **argv)
{
    if(argc < 2)
    {
        std::cout << "Missing file name" << std::endl;
        return 1;
    }

    std::fstream file(argv[1], std::ios::in | std::ios::binary);
    if(!file.is_open())
    {
        std::cout << "Couldn't open file " << argv[1] << std::endl;
        return 2;
    }

    std::vector<uint8_t> inputMediaData;
    do
    {
        char buf[READBUFSIZE];
        file.read(buf, READBUFSIZE);

        int size = file.gcount();
        if(size > 0)
            inputMediaData.insert(inputMediaData.end(), buf, buf + size);
    } while(!file.eof());
    file.close();

    //Initialize avcodec
    av_register_all();
    uint8_t *ioBuffer;
    AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
    AVCodecContext *codecCtxt = avcodec_alloc_context3(codec);
    AVCodecParserContext *parserCtxt = av_parser_init(AV_CODEC_ID_H264);
    AVOutputFormat *outputFormat = av_guess_format("mp4", nullptr, nullptr);
    AVFormatContext *formatCtxt;
    AVIOContext *ioCtxt;
    AVStream *videoStream;

    int res = avformat_alloc_output_context2(&formatCtxt, outputFormat, nullptr, nullptr);
    if(res < 0)
    {
        std::cout << "Couldn't initialize format context; the error was: " << getAvErrorString(res) << std::endl;
        return 3;
    }

    if((videoStream = avformat_new_stream( formatCtxt, avcodec_find_encoder(formatCtxt->oformat->video_codec) )) == nullptr)
    {
        std::cout << "Couldn't initialize video stream" << std::endl;
        return 4;
    }
    else if(!codec)
    {
        std::cout << "Couldn't initialize codec" << std::endl;
        return 5;
    }
    else if(codecCtxt == nullptr)
    {
        std::cout << "Couldn't initialize codec context" << std::endl;
        return 6;
    }
    else if(parserCtxt == nullptr)
    {
        std::cout << "Couldn't initialize parser context" << std::endl;
        return 7;
    }
    else if((ioBuffer = (uint8_t*)av_malloc(IOBUFSIZE)) == nullptr)
    {
        std::cout << "Couldn't allocate I/O buffer" << std::endl;
        return 8;
    }
    else if((ioCtxt = avio_alloc_context(ioBuffer, IOBUFSIZE, 1, nullptr, nullptr, mediaMuxCallback, nullptr)) == nullptr)
    {
        std::cout << "Couldn't initialize I/O context" << std::endl;
        return 9;
    }

    //Set video stream data
    videoStream->id = formatCtxt->nb_streams - 1;
    videoStream->codec->width = 1280;
    videoStream->codec->height = 720;
    videoStream->time_base.den = 60; //FPS
    videoStream->time_base.num = 1;
    videoStream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
    formatCtxt->pb = ioCtxt;

    //Retrieve SPS and PPS for codec extdata
    const uint32_t synchMarker = 0x01000000;
    unsigned int i = 0;
    int spsStart = -1, ppsStart = -1;
    uint16_t spsSize = 0, ppsSize = 0;
    while(spsSize == 0 || ppsSize == 0)
    {
        uint32_t *curr =  (uint32_t*)(inputMediaData.data() + i);
        if(*curr == synchMarker)
        {
            unsigned int currentNalStart = i;
            i += sizeof(uint32_t);
            uint8_t nalType = inputMediaData.data()[i] & 0x1F;
            if(nalType == SEQ_PARAM_SET)
                spsStart = currentNalStart;
            else if(nalType == PIC_PARAM_SET)
                ppsStart = currentNalStart;

            if(spsStart >= 0 && spsSize == 0 && spsStart != i)
                spsSize = currentNalStart - spsStart;
            else if(ppsStart >= 0 && ppsSize == 0 && ppsStart != i)
                ppsSize = currentNalStart - ppsStart;
        }
        ++i;
    }

    videoStream->codec->extradata = inputMediaData.data() + spsStart;
    videoStream->codec->extradata_size = ppsStart + ppsSize;

    //Write main header
    AVDictionary *options = nullptr;
    av_dict_set(&options, "movflags", "frag_custom+empty_moov", 0);
    res = avformat_write_header(formatCtxt, &options);
    if(res < 0)
    {
        std::cout << "Couldn't write container main header; the error was: " << getAvErrorString(res) << std::endl;
        return 10;
    }

    //Retrieve frames from input video and wrap them in container
    int currentInputIndex = 0;
    int framesInSecond = 0;
    while(currentInputIndex < inputMediaData.size())
    {
        uint8_t *frameBuffer;
        int frameSize;
        res = av_parser_parse2(parserCtxt, codecCtxt, &frameBuffer, &frameSize, inputMediaData.data() + currentInputIndex,
            inputMediaData.size() - currentInputIndex, AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
        if(frameSize == 0) //No more frames while some data still remains (is that even possible?)
        {
            std::cout << "Some data left unparsed: " << std::to_string(inputMediaData.size() - currentInputIndex) << std::endl;
            break;
        }

        //Prepare packet with video frame to be dumped into container
        AVPacket packet;
        av_init_packet(&packet);
        packet.data = frameBuffer;
        packet.size = frameSize;
        packet.stream_index = videoStream->index;
        currentInputIndex += frameSize;

        //Write packet to the video stream
        res = av_write_frame(formatCtxt, &packet);
        if(res < 0)
        {
            std::cout << "Couldn't write packet with video frame; the error was: " << getAvErrorString(res) << std::endl;
            return 11;
        }

        if(++framesInSecond == 60) //We want 1 segment per second
        {
            framesInSecond = 0;
            res = av_write_frame(formatCtxt, nullptr); //Flush segment
        }
    }
    res = av_write_frame(formatCtxt, nullptr); //Flush if something has been left

    //Write media data in container to file
    file.open("my_mp4.mp4", std::ios::out | std::ios::binary);
    if(!file.is_open())
    {
        std::cout << "Couldn't open output file " << std::endl;
        return 12;
    }

    file.write((char*)outputData.data(), outputData.size());
    if(file.fail())
    {
        std::cout << "Couldn't write to file" << std::endl;
        return 13;
    }

    std::cout << "Media file muxed successfully" << std::endl;
    return 0;
}

(I hardcoded a few values, such as the video dimensions and framerate, but as I said, this is just test code.)

A simple HTML webpage that uses MSE to play my fragmented MP4:

<!DOCTYPE html>
<html>
<head>
    <title>Test strumienia</title>
</head>
<body>
    <video width="1280" height="720" controls>
    </video>
</body>
<script>
var vidElement = document.querySelector('video');

if (window.MediaSource) {
  var mediaSource = new MediaSource();
  vidElement.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', sourceOpen);
} else {
  console.log("The Media Source Extensions API is not supported.")
}

function sourceOpen(e) {
  URL.revokeObjectURL(vidElement.src);
  var mime = 'video/mp4; codecs="avc1.640028"';
  var mediaSource = e.target;
  var sourceBuffer = mediaSource.addSourceBuffer(mime);
  var videoUrl = 'my_mp4.mp4';
  fetch(videoUrl)
    .then(function(response) {
      return response.arrayBuffer();
    })
    .then(function(arrayBuffer) {
      sourceBuffer.addEventListener('updateend', function(e) {
        if (!sourceBuffer.updating && mediaSource.readyState === 'open') {
          mediaSource.endOfStream();
        }
      });
      sourceBuffer.appendBuffer(arrayBuffer);
    });
}
</script>
</html>


The output MP4 file generated by my C++ application can be played, e.g. in MPC, but it doesn't play in any web browser I tested it with. It also doesn't have any duration (MPC keeps showing 00:00).

To compare with the output MP4 file I got from the C++ application described above, I also used FFMPEG to create a fragmented MP4 file from the same source file containing the raw H264 stream. I used the following command:

ffmpeg -r 60 -i input.h264 -c:v copy -f mp4 -movflags empty_moov+default_base_moof+frag_keyframe test.mp4
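
For reference, the same fragmentation options could also be set through the avformat API, analogously to the frag_custom+empty_moov entry in the muxer code above; as far as I know the mov muxer accepts the same movflags names as the command line, so a sketch would be:

AVDictionary *options = nullptr;
//Same option names as on the ffmpeg command line; passed to avformat_write_header()
//in place of the "frag_custom+empty_moov" entry used in the muxer code above
av_dict_set(&options, "movflags", "empty_moov+default_base_moof+frag_keyframe", 0);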

The file generated by FFMPEG plays correctly in every web browser I tested. It also has a correct duration (though it also has a trailing atom, which wouldn't be present in my live stream anyway; and since I need a live stream, it won't have any fixed duration in the first place).
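
A convenient way to compare the top-level structure of the two files is to walk their atoms programmatically. The sketch below is purely illustrative (it is not part of the original test code and only handles the common case of 32-bit box sizes):

#include <cstdint>
#include <cstdio>
#include <fstream>

int main(int argc, char **argv)
{
    if(argc < 2)
    {
        std::printf("Usage: %s file.mp4\n", argv[0]);
        return 1;
    }

    std::ifstream file(argv[1], std::ios::binary);
    uint8_t header[8];
    while(file.read(reinterpret_cast<char*>(header), sizeof(header)))
    {
        //Box header: 32-bit big-endian size followed by a FourCC type
        uint32_t size = (uint32_t(header[0]) << 24) | (uint32_t(header[1]) << 16) |
                        (uint32_t(header[2]) << 8)  |  uint32_t(header[3]);
        std::printf("%c%c%c%c  %u bytes\n", header[4], header[5], header[6], header[7],
                    (unsigned)size);

        if(size < 8) //size == 0 ("to end of file") and size == 1 (64-bit size) not handled here
            break;
        file.seekg(size - 8, std::ios::cur); //skip the box payload
    }
    return 0;
}

For a fragmented MP4 this should print something like ftyp and moov followed by repeating moof/mdat pairs, with the trailing atom showing up at the end of the FFMPEG-generated file.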

The MP4 atoms of both files look very similar (they have identical avcc sections for sure). What's interesting (though I'm not sure whether it's of any importance) is that both files use a different NAL format than the input file: the RPi camera produces a video stream in Annex-B format, while the output MP4 files contain NALs in AVCC format... or at least that appears to be the case when I compare the mdat atoms with the input H264 data.
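
To illustrate the difference: in Annex-B each NAL is preceded by a 00 00 00 01 start code, while in AVCC it is preceded by a big-endian length field, and this repackaging is exactly what the muxer performs on each NAL. A minimal sketch (assuming 4-byte start codes and 4-byte AVCC lengths; the function name is my own):

#include <cstdint>
#include <vector>

//Annex-B:  00 00 00 01 | NAL ...   (start code delimits each NAL)
//AVCC:     LL LL LL LL | NAL ...   (big-endian length of the NAL that follows)
std::vector<uint8_t> annexbNalToAvcc(const uint8_t *nal, uint32_t nalSize)
{
    //nal points just past the start code; nalSize is the size of the NAL payload
    std::vector<uint8_t> out;
    out.reserve(nalSize + sizeof(uint32_t));
    out.push_back((nalSize >> 24) & 0xFF); //the start code is replaced...
    out.push_back((nalSize >> 16) & 0xFF); //...by a 4-byte big-endian
    out.push_back((nalSize >> 8) & 0xFF);  //...length prefix
    out.push_back(nalSize & 0xFF);
    out.insert(out.end(), nal, nal + nalSize);
    return out;
}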

I assume there is some field (or a few fields) I need to set for avcodec to make it produce a video stream that browsers' players can properly decode and play. But which field(s) do I need to set? Or maybe the problem lies somewhere else? I've run out of ideas.

EDIT 1: As suggested, I investigated the binary content of both MP4 files (generated by my app and by the FFMPEG tool) with a hex editor. What I can confirm:

  • Both files have identical avcc sections (they match exactly and are in AVCC format; I analyzed it byte by byte and found nothing wrong with it)
  • Both files have NALs in AVCC format (I looked closely at the mdat atoms and there is no difference between the two MP4 files)

So I guess there's nothing wrong with the extradata creation in my code - avcodec takes care of it properly, even if I just feed it SPS and PPS NALs. It converts them by itself, so there's no need for me to do it by hand. Still, my original problem remains.
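
One quick way to verify this is to inspect the first bytes of the extradata after avformat_write_header(): a valid avcC record starts with configurationVersion == 1, and its next three bytes are exactly the profile/compatibility/level triple encoded in the avc1.640028 MIME string used by the player page. A small sketch (the helper name is mine):

#include <cstdint>
#include <cstdio>

void printAvccCodecString(const uint8_t *extradata, int size)
{
    if(size < 4 || extradata[0] != 1) //configurationVersion must be 1
    {
        std::printf("Extradata is not an avcC record (probably still Annex-B)\n");
        return;
    }
    //Bytes 1-3: AVCProfileIndication, profile_compatibility, AVCLevelIndication,
    //i.e. the "640028" part of 'video/mp4; codecs="avc1.640028"'
    std::printf("avc1.%02x%02x%02x\n", extradata[1], extradata[2], extradata[3]);
}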

EDIT 2: I achieved partial success - the MP4 generated by my app now plays in Firefox. I added this line to the code (along with the rest of the stream initialization):

videoStream->codec->time_base = videoStream->time_base;

So now this section of my code looks like this:

//Set video stream data
videoStream->id = formatCtxt->nb_streams - 1;
videoStream->codec->width = 1280;
videoStream->codec->height = 720;
videoStream->time_base.den = 60; //FPS
videoStream->time_base.num = 1;
videoStream->codec->time_base = videoStream->time_base;
videoStream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
formatCtxt->pb = ioCtxt;

Answer

I finally found the solution. My MP4 now plays in Chrome (and it still plays in the other browsers I tested).

In Chrome, chrome://media-internals/ shows MSE logs (of a sort). When I looked there, I found a few of the following warnings for my test player:

ISO-BMFF container metadata for video frame indicates that the frame is not a keyframe, but the video frame contents indicate the opposite.

That made me think, and encouraged me to set AV_PKT_FLAG_KEY for packets containing keyframes. I added the following code to the section that fills the AVPacket structure:

    //Check if keyframe field needs to be set
    int allowedNalsCount = 3; //In one packet there would be at most three NALs: SPS, PPS and video frame
    packet.flags = 0;
    for(int i = 0; i < frameSize && allowedNalsCount > 0; ++i)
    {
        uint32_t *curr =  (uint32_t*)(frameBuffer + i);
        if(*curr == synchMarker)
        {
            uint8_t nalType = frameBuffer[i + sizeof(uint32_t)] & 0x1F;
            if(nalType == KEYFRAME)
            {
                std::cout << "Keyframe detected at frame nr " << framesTotal << std::endl;
                packet.flags = AV_PKT_FLAG_KEY;
                break;
            }
            else
                i += sizeof(uint32_t) + 1; //We parsed this already, no point in doing it again

            --allowedNalsCount;
        }
    }

The KEYFRAME constant turns out to be 0x5 in my case (an IDR slice).
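
The snippet above also relies on two names that aren't shown in the question's code: the KEYFRAME constant and the framesTotal counter. Presumably the NalType enum was extended and a frame counter added; a reconstruction would look like:

enum NalType : uint8_t
{
    KEYFRAME      = 0x5, //coded slice of an IDR picture
    //NALs containing stream metadata
    SEQ_PARAM_SET = 0x7,
    PIC_PARAM_SET = 0x8
};

int framesTotal = 0; //incremented once per muxed frame, used only for the log message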
