How to use botocore.response.StreamingBody as stdin PIPE

Question

I want to pipe large video files from AWS S3 into Popen's stdin, which is from Python's point of view a 'file-like object'. This code runs as an AWS Lambda function, so these files won't fit in memory or on the local file system. Also, I don't want to copy these huge files anywhere, I just want to stream the input, process on the fly, and stream the output. I've already got the processing and streaming output bits working. The problem is how to obtain an input stream as a Popen pipe.

Update: I put together a short program that invokes StreamingBody.read(amt=chunk_size) based on a comment. The program reads some of the input file (an mp4 video) and gets stuck, possibly because the consumer of the data (ffmpeg) does not actually run, or maybe its STDIN buffer fills and the whole mess grinds to a halt?

I can access a file in an S3 bucket:

import boto3
s3 = boto3.resource('s3')
response = s3.Object(bucket_name=bucket, key=key).get()
body = response['Body']  

body is a botocore.response.StreamingBody; the response that contains it looks like this:

{ u'Body': <botocore.response.StreamingBody object at 0x00000000042EDAC8>, u'AcceptRanges': 'bytes', u'ContentType': 'video/mp4', 'ResponseMetadata': { 'HTTPStatusCode': 200, 'HostId': 'aAUs3IdkXP6vPGwauv6/USEBUWfxxVeueNnQVAm4odTkPABKUx1EbZO/iLcrBWb+ZiyqmQln4XU=', 'RequestId': '6B306488F6DFEEE9' }, u'LastModified': datetime.datetime(2015, 3, 1, 1, 32, 58, tzinfo=tzutc()), u'ContentLength': 393476644, u'ETag': '"71079d637e9f14a152170efdf73df679"', u'Metadata': {'cb-modifiedtime': 'Sun, 01 Mar 2015 01:27:52 GMT'}}

I intend to use body something like this:

from subprocess import Popen, PIPE
Popen(cmd, stdin=PIPE, stdout=PIPE).communicate(input=body)[0]

But of course body needs to be converted into a file-like object. The question is how?

Recommended answer

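To read binary data from a StreamingBody, call StreamingBody.read(). It returns a binary string (bytes). Calling read(amt=chunk_size), as in the update above, returns at most that many bytes per call, so a large object never has to sit in memory at once.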
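The answer stops at read(), so here is one possible way to connect it to Popen. This is a sketch under assumptions, not code from the original thread. It feeds fixed-size chunks into the child's stdin while a background thread drains stdout, which is one way to avoid the stall described in the update, where both pipe buffers fill up. The names pump, sink and CHUNK_SIZE are hypothetical. An io.BytesIO object stands in for the StreamingBody (the code only relies on read(amt)), and a small Python echo process stands in for ffmpeg, whose command line the question does not show.

```python
import io
import subprocess
import sys
import threading

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for the workload


def pump(body, cmd, sink, chunk_size=CHUNK_SIZE):
    """Stream body into cmd's stdin while copying cmd's stdout to sink.

    body only needs a read(amt) method, which StreamingBody provides.
    Returns the child's exit code.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def drain():
        # Read stdout concurrently so the child never blocks on a full pipe.
        for chunk in iter(lambda: proc.stdout.read(chunk_size), b""):
            sink.write(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        # read() returns b"" at end of stream, which ends the loop.
        for chunk in iter(lambda: body.read(chunk_size), b""):
            proc.stdin.write(chunk)
    except BrokenPipeError:
        pass  # the child exited before consuming all of its input
    finally:
        try:
            proc.stdin.close()  # flush and signal EOF to the child
        except BrokenPipeError:
            pass
        reader.join()
        proc.stdout.close()
    return proc.wait()


# Stand-in demo: BytesIO plays the role of the StreamingBody, and a tiny
# Python child that copies stdin to stdout plays the role of ffmpeg.
fake_body = io.BytesIO(b"x" * (1024 * 1024))
echo_cmd = [
    sys.executable,
    "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]
output = io.BytesIO()
exit_code = pump(fake_body, echo_cmd, output)
print(exit_code, len(output.getvalue()))
```

In the Lambda function, body would be response['Body'] from the question, cmd would be the real processing command, and sink would be whatever object the existing output-streaming code writes to. Note that communicate(input=...) expects the whole input as bytes, which is why this sketch writes to proc.stdin in a loop instead.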
