How to gzip while uploading into s3 using boto


Question

I have a large local file. I want to upload a gzipped version of that file into S3 using the boto library. The file is too large to gzip it efficiently on disk prior to uploading, so it should be gzipped in a streamed way during the upload.

The boto library provides a function set_contents_from_file(), which expects a file-like object that it will read from.
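For context, this is a minimal usage sketch of set_contents_from_file() with boto 2; the bucket name, key name, and local path are placeholders, not taken from the question:

import boto

conn = boto.connect_s3()                    # credentials come from the environment or boto config
bucket = conn.get_bucket('example-bucket')  # placeholder bucket name
key = bucket.new_key('example/data.gz')     # placeholder object key

with open('local-file', 'rb') as fp:
    key.set_contents_from_file(fp)          # boto reads the file-like object and uploads it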

The gzip library provides the class GzipFile, which can be given an object via the parameter fileobj; it writes compressed data to this object.
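A small sketch of GzipFile compressing into an in-memory buffer through fileobj (the sample data is illustrative):

import gzip
import io

buf = io.BytesIO()
gz = gzip.GzipFile(fileobj=buf, mode='wb')  # GzipFile writes compressed bytes into buf
gz.write(b'some data to compress')
gz.close()                                  # close() flushes and writes the gzip trailer

compressed = buf.getvalue()                 # the complete gzip stream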

I'd like to combine these two, but one API wants to read by itself while the other wants to write by itself; neither offers a passive counterpart (such as being written to or being read from).

Does anybody have an idea on how to combine these in a working fashion?

I accepted one answer (see below) because it pointed me in the right direction, but if you have the same problem you may find my own answer (also below) more helpful, because it implements a solution using multipart uploads.

Accepted answer

There really isn't a way to do this because S3 doesn't support true streaming input (i.e. chunked transfer encoding). You must know the Content-Length prior to upload, and the only way to know that is to have performed the gzip operation first.
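The asker's own follow-up answer (mentioned above, not reproduced in this article) worked around this limitation with S3 multipart uploads: each chunk of the file is compressed into an in-memory buffer, so the length of every part is known before it is sent, while the complete compressed file never has to exist on disk. The following is a rough sketch of that idea, assuming boto 2's multipart API; the bucket name, key name, chunk size, and the flush_part helper are illustrative, not the asker's actual code.

import io
import gzip
import boto

CHUNK = 64 * 1024 * 1024                        # uncompressed bytes read per part (placeholder size)

conn = boto.connect_s3()
bucket = conn.get_bucket('example-bucket')      # placeholder bucket name
mp = bucket.initiate_multipart_upload('example/data.gz')

buf = io.BytesIO()
gz = gzip.GzipFile(fileobj=buf, mode='wb')      # the compressor writes into the in-memory buffer

def flush_part(part_num):
    # Upload whatever compressed bytes have accumulated so far as one part,
    # then reset the buffer for the next part.
    buf.seek(0)
    mp.upload_part_from_file(buf, part_num=part_num)
    buf.seek(0)
    buf.truncate()

try:
    part = 1
    with open('large-local-file', 'rb') as src:
        while True:
            data = src.read(CHUNK)
            if not data:
                break
            gz.write(data)
            flush_part(part)
            part += 1
    gz.close()                                  # writes any remaining data and the gzip trailer into buf
    if buf.tell():
        flush_part(part)
    mp.complete_upload()
except Exception:
    mp.cancel_upload()
    raise

Note that S3 requires every part except the last to be at least 5 MB, so the uncompressed chunk size must be chosen large enough that each compressed part stays above that minimum.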

