Stream ffmpeg transcoding result to S3


Problem description

I want to transcode a large file using FFMPEG and store the result directly on AWS S3. This will be done inside of an AWS Lambda that has limited tmp space so I can't store the transcoding result locally and then upload it to S3 in a second step. I won't have enough tmp space. I therefore want to store the FFMPEG output directly on S3.

I therefore created an S3 pre-signed URL that allows 'PUT':

var outputPath = s3Client.GetPreSignedURL(new Amazon.S3.Model.GetPreSignedUrlRequest
{
    BucketName = "my-bucket",
    Expires = DateTime.UtcNow.AddMinutes(5),
    Key = "output.mp3",
    Verb = HttpVerb.PUT,
});

I then called ffmpeg with the resulting pre-signed url:

ffmpeg -i C:\input.wav -y -vn -ar 44100 -ac 2 -ab 192k -f mp3 https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550427237&Signature=%2BE8Wc%2F%2FQYrvGxzc%2FgXnsvauKnac%3D

FFMPEG returns an exit code of 1 with the following output:

ffmpeg version N-93120-ga84af760b8 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20190212
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 26.100 / 56. 26.100
  libavcodec     58. 47.100 / 58. 47.100
  libavformat    58. 26.101 / 58. 26.101
  libavdevice    58.  6.101 / 58.  6.101
  libavfilter     7. 48.100 /  7. 48.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'C:\input.wav':
  Duration: 00:04:16.72, bitrate: 3072 kb/s
    Stream #0:0: Audio: pcm_s32le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 3072 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s32le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to 'https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550427237&Signature=%2BE8Wc%2F%2FQYrvGxzc%2FgXnsvauKnac%3D':
  Metadata:
    TSSE            : Lavf58.26.101
    Stream #0:0: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s32p, 192 kb/s
    Metadata:
      encoder         : Lavc58.47.100 libmp3lame
size=     577kB time=00:00:24.58 bitrate= 192.2kbits/s speed=49.1x    
size=    1109kB time=00:00:47.28 bitrate= 192.1kbits/s speed=47.2x    
[tls @ 000001d73d786b00] Error in the push function.
av_interleaved_write_frame(): I/O error
Error writing trailer of https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550427237&Signature=%2BE8Wc%2F%2FQYrvGxzc%2FgXnsvauKnac%3D: I/O error
size=    1143kB time=00:00:48.77 bitrate= 192.0kbits/s speed=  47x    
video:0kB audio:1144kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[tls @ 000001d73d786b00] The specified session has been invalidated for some reason.
[tls @ 000001d73d786b00] Error in the pull function.
[https @ 000001d73d784fc0] URL read error:  -5
Conversion failed!

As you can see, I have a URL read error. This is a little surprising to me since I want to output to this URL and not read it.

Does anybody know how I can store my FFMPEG output directly to S3 without having to store it locally first?

Edit 1: I then tried to use the -method PUT parameter and to use http instead of https to remove TLS from the equation. Here's the output that I got when running ffmpeg with the -v trace option (a rough sketch of the command is shown first below).
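
The exact command isn't quoted in the question; judging from the options echoed in the trace, it would have been roughly the following (the URL is the pre-signed URL that appears in the trace):

ffmpeg -i C:\input.wav -y -vn -ar 44100 -ac 2 -ab 192k -f mp3 -method PUT -v trace "https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D"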

ffmpeg version N-93120-ga84af760b8 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20190212
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 26.100 / 56. 26.100
  libavcodec     58. 47.100 / 58. 47.100
  libavformat    58. 26.101 / 58. 26.101
  libavdevice    58.  6.101 / 58.  6.101
  libavfilter     7. 48.100 /  7. 48.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
Splitting the commandline.
Reading option '-i' ... matched as input url with argument 'C:\input.wav'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-vn' ... matched as option 'vn' (disable video) with argument '1'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)) with argument '44100'.
Reading option '-ac' ... matched as option 'ac' (set number of audio channels) with argument '2'.
Reading option '-ab' ... matched as option 'ab' (audio bitrate (please use -b:a)) with argument '192k'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'mp3'.
Reading option '-method' ... matched as AVOption 'method' with argument 'PUT'.
Reading option '-v' ... matched as option 'v' (set logging level) with argument 'trace'.
Reading option 'https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option y (overwrite output files) with argument 1.
Applying option v (set logging level) with argument trace.
Successfully parsed a group of options.
Parsing a group of options: input url C:\input.wav.
Successfully parsed a group of options.
Opening an input file: C:\input.wav.
[NULL @ 000001fb37abb180] Opening 'C:\input.wav' for reading
[file @ 000001fb37abc180] Setting default whitelist 'file,crypto'
Probing wav score:99 size:2048
[wav @ 000001fb37abb180] Format wav probed with size=2048 and score=99
[wav @ 000001fb37abb180] Before avformat_find_stream_info() pos: 54 bytes read:65590 seeks:1 nb_streams:1
[wav @ 000001fb37abb180] parser not found for codec pcm_s32le, packets or times may be invalid.
    Last message repeated 1 times
[wav @ 000001fb37abb180] All info found
[wav @ 000001fb37abb180] stream 0: start_time: -192153584101141.156 duration: 256.716
[wav @ 000001fb37abb180] format: start_time: -9223372036854.775 duration: 256.716 bitrate=3072 kb/s
[wav @ 000001fb37abb180] After avformat_find_stream_info() pos: 204854 bytes read:294966 seeks:1 frames:50
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'C:\input.wav':
  Duration: 00:04:16.72, bitrate: 3072 kb/s
    Stream #0:0, 50, 1/48000: Audio: pcm_s32le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 3072 kb/s
Successfully opened the file.
Parsing a group of options: output url https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D.
Applying option vn (disable video) with argument 1.
Applying option ar (set audio sampling rate (in Hz)) with argument 44100.
Applying option ac (set number of audio channels) with argument 2.
Applying option ab (audio bitrate (please use -b:a)) with argument 192k.
Applying option f (force format) with argument mp3.
Successfully parsed a group of options.
Opening an output file: https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D.
[http @ 000001fb37b15140] Setting default whitelist 'http,https,tls,rtp,tcp,udp,crypto,httpproxy'
[tcp @ 000001fb37b16c80] Original list of addresses:
[tcp @ 000001fb37b16c80] Address 52.216.8.203 port 80
[tcp @ 000001fb37b16c80] Interleaved list of addresses:
[tcp @ 000001fb37b16c80] Address 52.216.8.203 port 80
[tcp @ 000001fb37b16c80] Starting connection attempt to 52.216.8.203 port 80
[tcp @ 000001fb37b16c80] Successfully connected to 52.216.8.203 port 80
[http @ 000001fb37b15140] request: PUT /output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D HTTP/1.1
Transfer-Encoding: chunked
User-Agent: Lavf/58.26.101
Accept: */*
Connection: close
Host: landr-distribution-reportsdev-mb.s3.amazonaws.com
Icy-MetaData: 1
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s32le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 8 logical cores
[graph_0_in_0_0 @ 000001fb37b21080] Setting 'time_base' to value '1/48000'
[graph_0_in_0_0 @ 000001fb37b21080] Setting 'sample_rate' to value '48000'
[graph_0_in_0_0 @ 000001fb37b21080] Setting 'sample_fmt' to value 's32'
[graph_0_in_0_0 @ 000001fb37b21080] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 000001fb37b21080] tb:1/48000 samplefmt:s32 samplerate:48000 chlayout:0x3
[format_out_0_0 @ 000001fb37b22cc0] Setting 'sample_fmts' to value 's32p|fltp|s16p'
[format_out_0_0 @ 000001fb37b22cc0] Setting 'sample_rates' to value '44100'
[format_out_0_0 @ 000001fb37b22cc0] Setting 'channel_layouts' to value '0x3'
[format_out_0_0 @ 000001fb37b22cc0] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 000001fb37b0d940] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 000001fb37b251c0] picking s32p out of 3 ref:s32
[auto_resampler_0 @ 000001fb37b251c0] [SWR @ 000001fb37b252c0] Using fltp internally between filters
[auto_resampler_0 @ 000001fb37b251c0] ch:2 chl:stereo fmt:s32 r:48000Hz -> ch:2 chl:stereo fmt:s32p r:44100Hz
Output #0, mp3, to 'https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D':
  Metadata:
    TSSE            : Lavf58.26.101
    Stream #0:0, 0, 1/44100: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s32p, delay 1105, 192 kb/s
    Metadata:
      encoder         : Lavc58.47.100 libmp3lame
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
    Last message repeated 6 times
size=     649kB time=00:00:27.66 bitrate= 192.2kbits/s speed=55.3x    
size=    1207kB time=00:00:51.48 bitrate= 192.1kbits/s speed=51.5x    
av_interleaved_write_frame(): Unknown error
No more output streams to write to, finishing.
[libmp3lame @ 000001fb37b147c0] Trying to remove 47 more samples than there are in the queue
Error writing trailer of https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D: Error number -10054 occurred
size=    1251kB time=00:00:53.39 bitrate= 192.0kbits/s speed=51.5x    
video:0kB audio:1252kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (C:\input.wav):
  Input stream #0:0 (audio): 5014 packets read (20537344 bytes); 5014 frames decoded (2567168 samples); 
  Total: 5014 packets (20537344 bytes) demuxed
Output file #0 (https://my-bucket.s3.amazonaws.com/output.mp3?AWSAccessKeyId=AKIAJDSGJWM63VQEXHIQ&Expires=1550695990&Signature=dy3RVqDlX%2BlJ0INlDkl0Lm1Rqb4%3D):
  Output stream #0:0 (audio): 2047 frames encoded (2358144 samples); 2045 packets muxed (1282089 bytes); 
  Total: 2045 packets (1282089 bytes) muxed
5014 frames successfully decoded, 0 decoding errors
[AVIOContext @ 000001fb37b1f440] Statistics: 0 seeks, 2046 writeouts
[http @ 000001fb37b15140] URL read error:  -10054
[AVIOContext @ 000001fb37ac4400] Statistics: 20611126 bytes read, 1 seeks
Conversion failed!

So it looks like it is able to connect to my S3 pre-signed URL, but I still get the Error writing trailer error coupled with a URL read error.

Recommended answer

Since the goal is to take a stream of bytes from S3 and output it back to S3, it is not necessary to use the HTTP capabilities of ffmpeg. ffmpeg is built as a command line tool that can take its input from stdin and write its output to stdout/stderr, and it is simpler to use those capabilities than to try to have ffmpeg handle the HTTP reading/writing itself. You just have to connect an HTTP stream (that reads from S3) to ffmpeg's stdin and connect its stdout to another stream (that writes to S3). See here for more information on ffmpeg piping.

The simplest implementation would look like this:

var s3Client = new AmazonS3Client(RegionEndpoint.USEast1);

var startInfo = new ProcessStartInfo
{
    FileName = "ffmpeg",
    Arguments = $"-i pipe:0 -y -vn -ar 44100 -ab 192k -f mp3 pipe:1",
    CreateNoWindow = true,
    RedirectStandardInput = false,
    RedirectStandardOutput = false,
    UseShellExecute = false,
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
};

using (var process = new Process { StartInfo = startInfo })
{
    // Get a stream to an object stored on S3.
    var s3InputObject = await s3Client.GetObjectAsync(new GetObjectRequest
    {
        BucketName = "my-bucket",
        Key = "input.wav",
    });

    process.Start();

    // Store the output of ffmpeg directly on S3 in a background thread
    // since I don't 'await'.
    var uploadTask = s3Client.PutObjectAsync(new PutObjectRequest
    {
        BucketName = "my-bucket",
        Key = "output.wav",
        InputStream = process.StandardOutput.BaseStream,
    });

    // Feed the S3 input stream into ffmpeg
    await s3InputObject.ResponseStream.CopyToAsync(process.StandardInput.BaseStream);
    process.StandardInput.Close();

    // Wait for ffmpeg to be done
    await uploadTask;

    process.WaitForExit();
}

This snippet gives an idea of how to pipe the input/output of ffmpeg.

Unfortunately, this code does not work. The call to PutObjectAsync throws an exception that says Could not determine content length. And that's true: S3 only allows uploads of a known size, so we can't use PutObjectAsync, because we don't know how big the output of ffmpeg will be.

The workaround is to use an S3 multipart upload. Instead of feeding the ffmpeg output directly to S3, you write it into a memory buffer that is not too big (let's say 25 MB), so that it won't consume all the memory of the AWS Lambda running this code. Every time the buffer is full, you upload it to S3 as one part of a multipart upload. Then, once ffmpeg is done transcoding the input file, you upload whatever is left in the current buffer as the last part and simply call CompleteMultipartUpload. This takes all the 25 MB parts and merges them into a single file.
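
A minimal sketch of that strategy, using the multipart upload APIs of the AWS SDK for .NET, might look like the code below. It is illustrative only: the 25 MB buffer size and the output.mp3 key are assumptions rather than the author's actual code, and it reuses the s3Client and process variables from the snippet above. In practice this loop would replace the PutObjectAsync call and run concurrently with the code that feeds ffmpeg's stdin (for example inside a Task.Run), otherwise ffmpeg would block once its stdout pipe fills up.

// Illustrative sketch only. Requires the Amazon.S3, Amazon.S3.Model,
// System.Collections.Generic and System.IO namespaces, and assumes 'process'
// is the ffmpeg Process and 's3Client' the AmazonS3Client from the snippet above.
const int bufferSize = 25 * 1024 * 1024; // 25 MB; S3 parts must be at least 5 MB (except the last one)

var init = await s3Client.InitiateMultipartUploadAsync(new InitiateMultipartUploadRequest
{
    BucketName = "my-bucket",
    Key = "output.mp3",
});

var partResponses = new List<UploadPartResponse>();
var buffer = new byte[bufferSize];
var partNumber = 1;
var filled = 0;
int read;

// Read ffmpeg's stdout into a fixed-size buffer and upload a part every time it fills up.
var ffmpegOutput = process.StandardOutput.BaseStream;
while ((read = await ffmpegOutput.ReadAsync(buffer, filled, buffer.Length - filled)) > 0)
{
    filled += read;
    if (filled == buffer.Length)
    {
        partResponses.Add(await s3Client.UploadPartAsync(new UploadPartRequest
        {
            BucketName = "my-bucket",
            Key = "output.mp3",
            UploadId = init.UploadId,
            PartNumber = partNumber++,
            InputStream = new MemoryStream(buffer, 0, filled),
            PartSize = filled,
        }));
        filled = 0;
    }
}

// Upload whatever is left in the buffer as the final, possibly smaller, part.
if (filled > 0)
{
    partResponses.Add(await s3Client.UploadPartAsync(new UploadPartRequest
    {
        BucketName = "my-bucket",
        Key = "output.mp3",
        UploadId = init.UploadId,
        PartNumber = partNumber,
        InputStream = new MemoryStream(buffer, 0, filled),
        PartSize = filled,
    }));
}

// Merge all the uploaded parts into a single output.mp3 object.
var completeRequest = new CompleteMultipartUploadRequest
{
    BucketName = "my-bucket",
    Key = "output.mp3",
    UploadId = init.UploadId,
};
completeRequest.AddPartETags(partResponses);
await s3Client.CompleteMultipartUploadAsync(completeRequest);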

That's it. With this strategy it is possible to read a file from S3, transcode it, and store it back to S3 on the fly without storing anything locally. It is therefore possible to transcode large files in an AWS Lambda using very little memory and virtually no disk space.

This was implemented successfully. I will try to see if this code can be shared.

Warning: as mentioned in a comment, the result is not 100% identical between streaming the output of ffmpeg and letting ffmpeg write to a local file itself. When writing to a local file, ffmpeg can seek back to the beginning of the file once it is done transcoding and update the file metadata with some results of the transcoding. I don't know what the impact of not having this updated metadata is.
