Transfer file from URL to Cloud Storage


Question

I'm a Ruby dev trying my hand at Google Cloud Functions written in Python, and I have hit a wall transferring a remote file from a given URL to Google Cloud Storage (GCS).

In an equivalent RoR app, I download to the app's ephemeral storage and then upload to GCS.

I am hoping there's a way to simply 'download' the remote file to my GCS bucket via the Cloud Function.

Here's a simplified example of what I am doing, with some comments. The real code fetches the URLs from a private API, but that works fine and isn't where the issue is.

from google.cloud import storage
project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

# This works fine
#source_file_name = 'localfile.txt'

# When using a remote URL I get 'IOError: [Errno 2] No such file or directory'
source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)

upload_blob(bucket_name, source_file_name, destination_blob_name)

Thanks in advance.

Recommended answer

It is not possible to upload a file to Google Cloud Storage directly from a URL. upload_from_filename opens its argument as a local path, which is why passing a URL raises 'No such file or directory'. Since you are running the script from a local environment, the file contents that you want to upload need to be in that same environment. This means that the contents of the URL need to be stored either in memory or in a file.

An example showing how to do it, based on your code:

Option 1: You can use the wget module, which fetches the URL and downloads its contents into a local file (similar to the wget CLI command). Note that this means the file is stored locally first and then uploaded from that file. I added the os.remove line to delete the file once the upload is done.

from google.cloud import storage
import wget
import os

project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):   
    filename = wget.download(source_file_name)

    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(filename, content_type='image/jpeg')
    os.remove(filename)

upload_blob(bucket_name, source_file_name, destination_blob_name)

Option 2: Use the urllib module, which works similarly to the wget module, but instead of writing to a file it reads the contents into a variable. Note that I wrote this example in Python 3; there are some differences if you plan to run your script in Python 2.x.

from google.cloud import storage
import urllib.request

project_id = 'my-project'
bucket_name = 'my-bucket'
destination_blob_name = 'upload.test'
storage_client = storage.Client.from_service_account_json('my_creds.json')

source_file_name = 'http://www.hospiceofmontezuma.org/wp-content/uploads/2017/10/confused-man.jpg'

def upload_blob(bucket_name, source_file_name, destination_blob_name):   
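    # Fetch the remote file; read() below loads the whole body into memory
    response = urllib.request.urlopen(source_file_name)

    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    # Upload the downloaded bytes to the destination blob
    blob.upload_from_string(response.read(), content_type='image/jpeg')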

upload_blob(bucket_name, source_file_name, destination_blob_name)
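
For larger files, a middle ground between the two options is to copy the response into a temporary file in chunks and hand the open file object to blob.upload_from_file. The following is only a minimal sketch under stated assumptions: transfer_url_to_blob and chunk_size are hypothetical names introduced here, blob is assumed to be a google.cloud.storage Blob such as the one returned by bucket.blob(destination_blob_name) above, and the content type is guessed from the URL's extension rather than hardcoded.

```python
import mimetypes
import shutil
import tempfile
import urllib.request


def transfer_url_to_blob(url, blob, chunk_size=1024 * 1024):
    """Copy the contents of `url` into `blob` via a temporary file.

    `blob` is assumed to be a google.cloud.storage Blob (hypothetical helper,
    not part of the original answer).
    """
    # Guess the content type from the URL's extension; fall back to a generic type.
    content_type, _ = mimetypes.guess_type(url)
    content_type = content_type or "application/octet-stream"

    with urllib.request.urlopen(url) as response, tempfile.TemporaryFile() as tmp:
        # Copy chunk by chunk instead of calling response.read() on the whole body.
        shutil.copyfileobj(response, tmp, chunk_size)
        tmp.seek(0)  # rewind so the upload starts from the first byte
        blob.upload_from_file(tmp, content_type=content_type)
    return content_type
```

With the variables from the examples above, this could be called as transfer_url_to_blob(source_file_name, storage_client.get_bucket(bucket_name).blob(destination_blob_name)). The temporary file is removed automatically when the with block exits, so no os.remove call is needed.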
