Boto3: Wait for S3 streaming upload to complete


Question

I'm using S3.Client.upload_fileobj() with a BytesIO stream as input to upload a file to S3 from a stream. My function should not return before the upload is finished, so I need a way to wait for it.

From the documentation there is no obvious way to wait for the transfer to finish, but there are some hints of what could work:

  1. Use the Callback argument to wait until progress is at 100%. In JavaScript this would be trivial using callbacks or promises, but in Python I'm not so sure.
  2. Use an S3.Waiter object that checks whether the object exists. But it does so by polling every 5 seconds, which seems very inefficient. Also, I'm not sure whether it would wait until the object is complete.
  3. There's a class S3.MultipartUpload with a .complete() method, but I doubt that does what I want.
  4. Run a loop that checks whether the object is completely uploaded and, if not, sleeps for a bit. But how do I check whether the object is complete?
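
For reference, option 2 can be sketched as below. This is only an illustration of the waiter approach, not something the answer recommends. The helper name wait_for_object and its default values are hypothetical, and s3_client is assumed to have been created with boto3.client("s3"). S3 only makes an object visible once its upload has completed, so a successful wait implies a complete object.

```python
def wait_for_object(s3_client, bucket, key, delay=5, max_attempts=20):
    """Poll S3 until the object exists (hypothetical helper).

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    The waiter issues HeadObject requests every `delay` seconds and raises
    an error once `max_attempts` polls have failed.
    """
    waiter = s3_client.get_waiter("object_exists")
    waiter.wait(
        Bucket=bucket,
        Key=key,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
```

Polling like this costs one request per attempt and adds latency of up to one delay interval, which is the inefficiency the question points out.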

I've been googling but it seems nobody is asking the same question. Also, most results talking about related issues are using a different API (I believe upload_fileobj() is rather new).

EDIT: I found out about S3.Client.put_object, which also accepts a file-like object and blocks until the server has responded. But would that work in combination with streams? I'm not sure how Python multithreading works here. The stream originally comes from S3.Client.download_fileobj(), gets piped through a subprocess.Popen() and is then supposed to be uploaded back to S3. Both the download and the subprocess run in parallel threads/processes as far as I can tell.
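
The pipeline described in the edit could look roughly like the sketch below. It is not the asker's actual code: the function name pipe_object_through and all of its parameters are hypothetical, s3_client is assumed to be a boto3 S3 client, and it assumes that boto3's transfer layer accepts the subprocess's non-seekable pipes as file-like objects. Error handling is left out; in particular, an exception raised in the download thread is not propagated.

```python
import subprocess
import threading


def pipe_object_through(s3_client, command, src_bucket, src_key,
                        dst_bucket, dst_key):
    """Download an object, pipe it through `command`, upload the result.

    Hypothetical helper; returns the exit code of the subprocess.
    """
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )

    def feed():
        # Run the download in its own thread so that writing to stdin and
        # reading from stdout can make progress at the same time.
        try:
            s3_client.download_fileobj(src_bucket, src_key, proc.stdin)
        finally:
            proc.stdin.close()  # signals end of input to the subprocess

    feeder = threading.Thread(target=feed)
    feeder.start()

    # Blocks until the subprocess closes stdout and the upload call returns.
    s3_client.upload_fileobj(proc.stdout, dst_bucket, dst_key)

    feeder.join()
    proc.stdout.close()
    return proc.wait()
```

A caller would normally check the returned exit code, since a subprocess that fails partway can still produce output that gets uploaded.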

Recommended answer

The upload_file/upload_fileobj methods take care of what you're looking for (i.e., they wait for the upload of the object/file to complete).

I don't suggest the 1st or 4th option. There's no need to use an S3 waiter either, as the upload_file/upload_fileobj methods return only after the upload job is done.

Note that the upload_file/upload_fileobj methods automatically handle reading/writing files as well as doing multipart uploads in parallel for large files, so there's no need to use the multipart upload API yourself, irrespective of file size.
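
As a minimal sketch of that behaviour: the function below returns only once upload_fileobj has returned, so the caller can treat the upload as finished. The names ProgressTracker and upload_stream are hypothetical, and s3_client is assumed to be a boto3 S3 client created with boto3.client("s3"). The optional Callback is used here only to count bytes, not to detect completion.

```python
import threading


class ProgressTracker:
    """Thread-safe byte counter usable as the Callback argument."""

    def __init__(self):
        self._lock = threading.Lock()
        self.bytes_seen = 0

    def __call__(self, bytes_amount):
        # The callback may be invoked from several transfer threads,
        # so the counter is updated under a lock.
        with self._lock:
            self.bytes_seen += bytes_amount


def upload_stream(s3_client, stream, bucket, key):
    """Upload a binary file-like object and return the bytes reported."""
    tracker = ProgressTracker()
    s3_client.upload_fileobj(stream, bucket, key, Callback=tracker)
    # Reaching this line means upload_fileobj has returned without raising,
    # i.e. the transfer is done; no waiter or polling loop is involved.
    return tracker.bytes_seen
```

With a real client this would be called as upload_stream(boto3.client("s3"), io.BytesIO(data), bucket_name, key_name), where the bucket and key names are placeholders chosen by the caller.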
