How to transfer file to azure blob storage in chunks without writing to file using python

Problem description

I need to transfer files from Google Cloud Storage to Azure Blob Storage.

Google gives a code snippet for downloading files into a byte variable, like so:

# Get payload data from Google Cloud Storage.
# (Assumes `client` is an authorized storage service object.)
import io
from googleapiclient.http import MediaIoBaseDownload

req = client.objects().get_media(
        bucket=bucket_name,
        object=object_name,
        generation=generation)    # optional
# The BytesIO object may be replaced with any io.Base instance.
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, req, chunksize=1024*1024)
done = False
while not done:
    status, done = downloader.next_chunk()
    if status:
        print('Download %d%%.' % int(status.progress() * 100))
print('Download Complete!')
print(fh.getvalue())

I was able to modify this to store to a file instead by changing the fh object type, like so:

fh = open(object_name, 'wb')

Then I can upload to Azure Blob Storage using blob_service.put_block_blob_from_path.

I want to avoid writing to a local file on the machine doing the transfer.

I gather Google's snippet loads the data into the io.BytesIO() object one chunk at a time. I reckon I should probably use this to write to blob storage one chunk at a time as well.
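
For illustration, a minimal sketch of that chunk-at-a-time idea might look like the following. It is an assumption-laden sketch, not tested code: it reuses the placeholder names from the snippets in this question (client, bucket_name, object_name, account_name, account_key, container_name, blob_name), and it assumes the legacy azure-storage BlobService API where put_block takes a base64-encoded block id and put_block_list takes a plain list of those ids; check both signatures against the installed SDK version.

import base64
import io

from googleapiclient.http import MediaIoBaseDownload
from azure.storage.blob import BlobService

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB per block

blob_service = BlobService(account_name, account_key)

req = client.objects().get_media(bucket=bucket_name, object=object_name)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, req, chunksize=CHUNK_SIZE)

block_ids = []
done = False
while not done:
    status, done = downloader.next_chunk()
    # next_chunk() appends the newly downloaded bytes to fh; pull them out
    # and reset the buffer so only one chunk is held in memory at a time.
    data = fh.getvalue()
    fh.seek(0)
    fh.truncate(0)
    if data:
        # Block ids must all have the same length; assuming here that the
        # caller supplies them base64-encoded.
        block_id = base64.b64encode(
            '{0:08d}'.format(len(block_ids)).encode()).decode()
        blob_service.put_block(container_name, blob_name, data, block_id)
        block_ids.append(block_id)

# Commit the staged blocks as a single block blob.
blob_service.put_block_list(container_name, blob_name, block_ids)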

I experimented with reading the whole thing into memory and then uploading with put_block_blob_from_bytes, but I got a memory error (the file is probably too big, ~600 MB).

Any suggestions?

Solution

According to the source code of blobservice.py for Azure Storage and BlobReader for Google Cloud Storage, you can use the Azure function blobservice.put_block_blob_from_file to write the stream, because the GCS BlobReader class can be read as a stream; please see below.

So, referring to the code from https://cloud.google.com/appengine/docs/python/blobstore/#Python_Using_BlobReader, you can try something like the following.

from google.appengine.ext import blobstore
from azure.storage.blob import BlobService

blob_key = ...
blob_reader = blobstore.BlobReader(blob_key)

blob_service = BlobService(account_name, account_key)
container_name = ...
blob_name = ...
# BlobReader is a file-like object, so it can be passed directly as the stream.
blob_service.put_block_blob_from_file(container_name, blob_name, blob_reader)
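
Note that blobstore.BlobReader is part of the App Engine runtime. The key point is simply that put_block_blob_from_file accepts a readable file-like object, so nothing has to be written to local disk; outside App Engine the same call should work with any stream-like source (for example, the BytesIO buffer from the question after rewinding it with fh.seek(0)), subject to the memory caveat already mentioned.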
