Transfer file from URL to Cloud Storage


Problem description

I'm a Ruby dev trying my hand at Google Cloud Functions written in Python, and I have hit a wall with transferring a remote file from a given URL to Google Cloud Storage (GCS).

In an equivalent RoR app I download to the app's ephemeral storage and then upload to GCS.

I am hoping there's a way to simply 'download' the remote file to my GCS bucket via the Cloud Function.

Here's a simplified example of what I am doing, with some comments. The real code fetches the URLs from a private API, but that part works fine and isn't where the issue is.

from google.cloud import storage
project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

# This works fine
#source_file_name = 'localfile.txt'

# When using a remote URL I get 'IOError: [Errno 2] No such file or directory'
source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)

upload_blob(bucket_name, source_file_name, destination_blob_name)

Thanks.

Recommended answer

It is not possible to upload a file to Google Cloud Storage directly from a URL. Since you are running the script from a local environment, the file contents that you want to upload need to be in that same environment. This means that the contents of the URL need to be stored either in memory or in a file.

Here are two examples showing how to do it, based on your code:

Option 1: You can use the wget module, which fetches the URL and downloads its contents into a local file (similar to the wget CLI command). Note that this means the file is stored locally first and then uploaded from that file. I added the os.remove line to delete the file once the upload is done.

from google.cloud import storage
import wget
import os

project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):   
    filename = wget.download(source_file_name)

    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(filename, content_type='image/jpeg')
    os.remove(filename)

upload_blob(bucket_name, source_file_name, destination_blob_name)

Option 2: Use the urllib module. It works similarly to the wget module, but instead of writing to a file it reads the contents into a variable, so they are held in memory. Note that I wrote this example in Python 3; there are some differences if you plan to run your script in Python 2.X.

from google.cloud import storage
import urllib.request

project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):   
    # Open the URL; the response body is read into memory below
    response = urllib.request.urlopen(source_file_name)

    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_string(response.read(), content_type='image/jpeg')

upload_blob(bucket_name, source_file_name, destination_blob_name)
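
The two options can also be combined. The following is only a sketch under stated assumptions, not part of the original answer: the helper name copy_url_to_blob and its max_in_memory parameter are hypothetical, and the blob argument is assumed to behave like a google.cloud.storage Blob (only its upload_from_file method is used). It buffers small downloads in memory, spills larger ones to a temporary file, and takes the content type from the server's response instead of hardcoding it.

```python
import shutil
import tempfile
import urllib.request


def copy_url_to_blob(url, blob, max_in_memory=10 * 1024 * 1024):
    """Download `url` and upload its bytes to `blob` (hypothetical helper).

    `blob` is assumed to provide upload_from_file(file_obj, content_type=...),
    as google.cloud.storage.Blob does.
    """
    with urllib.request.urlopen(url) as response:
        # Use the Content-Type reported by the server rather than hardcoding it.
        content_type = response.headers.get_content_type()
        # Downloads up to max_in_memory bytes stay in memory; larger ones
        # are written to a temporary file on disk.
        with tempfile.SpooledTemporaryFile(max_size=max_in_memory) as buffer:
            shutil.copyfileobj(response, buffer)
            buffer.seek(0)  # rewind so the upload starts at the first byte
            blob.upload_from_file(buffer, content_type=content_type)
    return content_type
```

With the client from the examples above, this could be called as copy_url_to_blob(source_file_name, storage_client.get_bucket(bucket_name).blob(destination_blob_name)). The temporary buffer is discarded automatically when the with block exits, so no os.remove call is needed.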
