Google Cloud Storage + Python: Any way to list objects in a certain folder in GCS?

Problem description

I'm going to write a Python program to check whether a file is in a certain folder of my Google Cloud Storage bucket. The basic idea is to get the list of all objects in the folder (a file name list), then check whether the file abc.txt is in that list.

Now the problem is that Google seems to provide only one way to get the object list, which is uri.get_bucket(). See the code below, which is from https://developers.google.com/storage/docs/gspythonlibrary#listing-objects

uri = boto.storage_uri(DOGS_BUCKET, GOOGLE_STORAGE)
for obj in uri.get_bucket():
    print '%s://%s/%s' % (uri.scheme, uri.bucket_name, obj.name)
    print '  "%s"' % obj.get_contents_as_string()

The drawback of uri.get_bucket() is that it appears to fetch all of the objects first, which is not what I want. I just need the object name list of a particular folder (e.g. gs://mybucket/abc/myfolder), which should be much quicker.

Could someone help answer? Appreciate every answer!

Recommended answer

Update: the below is true for the older "Google API Client Libraries" for Python, but if you're not using that client, prefer the newer "Google Cloud Client Library" for Python ( https://googleapis.dev/python/storage/latest/index.html ). For the newer library, the equivalent to the below code is:

from google.cloud import storage

client = storage.Client()
for blob in client.list_blobs('bucketname', prefix='abc/myfolder'):
  print(str(blob))
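
To tie the listing back to the original goal (checking whether abc.txt is in a folder), here is a minimal sketch. The helper names `list_folder_names` and `object_exists_in_folder` are hypothetical, not part of the library. It assumes the prefix should end with a trailing slash, since a prefix of 'abc/myfolder' would also match objects under 'abc/myfolder2/'.

```python
def object_exists_in_folder(blob_names, folder, filename):
    """Return True if '<folder>/<filename>' is among the given object names.

    `blob_names` is any iterable of full object names, for example the
    names produced by list_folder_names() below.
    """
    target = folder.rstrip('/') + '/' + filename
    return any(name == target for name in blob_names)


def list_folder_names(bucket_name, folder):
    """Yield the names of objects under 'folder/' (hypothetical helper).

    Assumes the google-cloud-storage package is installed and that
    default credentials are available in the environment.
    """
    from google.cloud import storage  # imported lazily, only when called

    client = storage.Client()
    # The trailing slash keeps 'abc/myfolder2/...' out of the results.
    prefix = folder.rstrip('/') + '/'
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        yield blob.name
```

With real credentials, a call might look like `object_exists_in_folder(list_folder_names('bucketname', 'abc/myfolder'), 'abc/myfolder', 'abc.txt')`. If only a single known object needs checking, `client.bucket('bucketname').blob('abc/myfolder/abc.txt').exists()` is likely a cheaper alternative, since it avoids listing the folder at all.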

Answer for older client follows.

You may find it easier to work with the JSON API, which has a full-featured Python client. It has a function for listing objects that takes a prefix parameter, which you could use to check for a certain directory and its children in this manner:

import json

from googleapiclient import discovery

# Auth goes here if necessary. Create authorized http object...
client = discovery.build('storage', 'v1')  # add http=whatever param if auth
request = client.objects().list(
    bucket="mybucket",
    prefix="abc/myfolder")
# Page through the results; list_next() returns None after the last page.
while request is not None:
    response = request.execute()
    print(json.dumps(response, indent=2))
    request = client.objects().list_next(request, response)
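
The loop above prints each raw response page. As a rough sketch of how those pages could be turned into the file name list the question asks for, the helpers below (`object_names` and `folder_contains`, both hypothetical names) read the "items" array of each objects.list response and compare the "name" fields. The sample pages are made-up data shaped like the JSON API response, not real output.

```python
def object_names(pages):
    """Yield object names from an iterable of objects.list response dicts."""
    for page in pages:
        # A page without matching objects may have no "items" key at all.
        for item in page.get("items", []):
            yield item["name"]


def folder_contains(pages, folder, filename):
    """Return True if '<folder>/<filename>' appears in any response page."""
    target = folder.rstrip("/") + "/" + filename
    return target in object_names(pages)


if __name__ == "__main__":
    # Made-up pages for illustration only.
    sample_pages = [
        {"kind": "storage#objects",
         "items": [{"name": "abc/myfolder/abc.txt"},
                   {"name": "abc/myfolder/sub/notes.txt"}]},
        {"kind": "storage#objects"},
    ]
    print(folder_contains(sample_pages, "abc/myfolder", "abc.txt"))
```

In the paging loop, each `response` could be appended to a list (or yielded from a generator) and then passed to `folder_contains`.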

Fuller documentation of the list call is here: https://developers.google.com/storage/docs/json_api/v1/objects/list

And the Google Python API client is documented here: https://code.google.com/p/google-api-python-client/
