使用 Python 从 Google 云存储下载多个文件 [英] Download multiple file from Google cloud storage using Python

查看:13
本文介绍了使用 Python 从 Google 云存储下载多个文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从 Google 云存储文件夹下载多个文件.我可以下载单个文件,但无法下载多个文件.我从 此链接 但似乎不起作用.代码如下:

I am trying to download multiple files from the Google cloud storage folder. I am able to download the single file but unable to download multiple files. I took this reference from this link but seems it is not working. The code is as follow:

# [download multiple files]
bucket_name = 'bigquery-hive-load'
# The "folder" where the files you want to download are
folder="/projects/bigquery/download/shakespeare/"

# Create this folder locally
if not os.path.exists(folder):
    os.makedirs(folder)

# Retrieve all blobs with a prefix matching the folder
    bucket=storage_client.get_bucket(bucket_name)
    print(bucket)
    blobs=list(bucket.list_blobs(prefix=folder))
    print(blobs)
    for blob in blobs:
        if(not blob.name.endswith("/")):
            blob.download_to_filename(blob.name)

# [End download to multiple files]

有没有办法下载与模式(名称)或其他东西匹配的多个文件.由于我是从 bigquery 导出文件,因此文件名将如下所示:

Is there any way to download multiple files matching with the pattern(name) or something else. Since I am exporting the file from bigquery, the file names will be something like below:

shakespeare-000000000000.csv.gz
shakespeare-000000000001.csv.gz
shakespeare-000000000002.csv.gz
shakespeare-000000000003.csv.gz

参考:下载单个文件的工作代码:

Reference: Working code to download single file:

# [download to single files]

edgenode_destination_uri = '/projects/bigquery/download/shakespeare-000000000000.csv.gz'
bucket_name = 'bigquery-hive-load'
gcs_bucket = storage_client.get_bucket(bucket_name)
blob = gcs_bucket.blob("shakespeare.csv.gz")
blob.download_to_filename(edgenode_destination_uri)
logging.info('Downloded {} to {}'.format(
    gcs_bucket, edgenode_destination_uri))

# [end download to single files]

推荐答案

经过一番尝试,我解决了这个问题,也忍不住在这里发帖.

After some trial, I solved this and couldn't stop myself from posting here as well.

bucket_name = 'mybucket'
folder='/projects/bigquery/download/shakespeare/'
delimiter='/'
file = 'shakespeare'

# Retrieve all blobs with a prefix matching the file.
bucket=storage_client.get_bucket(bucket_name)
# List blobs iterate in folder 
blobs=bucket.list_blobs(prefix=file, delimiter=delimiter) # Excluding folder inside bucket
for blob in blobs:
   print(blob.name)
   destination_uri = '{}/{}'.format(folder, blob.name) 
   blob.download_to_filename(destination_uri)

这篇关于使用 Python 从 Google 云存储下载多个文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆