Google Colab: how to read data from my Google Drive?

Question

The problem is simple: I have some data on Google Drive, for example at /projects/my_project/my_data*.

I also have a simple notebook in Colab.

So I would like to do something like:

for file in glob.glob("/projects/my_project/my_data*"):
    do_something(file)

Unfortunately, all the examples (this one, for instance: https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb) mainly suggest loading all the necessary data into the notebook.

But if I have a lot of pieces of data, that can get quite complicated. Is there any way to solve this?

Thanks for the help!

Recommended answer

Good news: PyDrive has first-class support on Colab! PyDrive is a wrapper for the Google Drive Python client. Here is an example of how you would download ALL files from a folder, similar to using glob + *:

!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_download_path = os.path.expanduser('~/data')
# exist_ok=True avoids an error if the directory already exists.
os.makedirs(local_download_path, exist_ok=True)

# 2. Auto-iterate using the query syntax
#    https://developers.google.com/drive/v2/web/search-parameters
file_list = drive.ListFile(
    {'q': "'1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk' in parents"}).GetList()

for f in file_list:
  # 3. Create & download by id.
  print('title: %s, id: %s' % (f['title'], f['id']))
  fname = os.path.join(local_download_path, f['title'])
  print('downloading to {}'.format(fname))
  f_ = drive.CreateFile({'id': f['id']})
  f_.GetContentFile(fname)


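# 4. Print the contents of the last downloaded file, if there was one.
if file_list:
  with open(fname, 'r') as downloaded:
    print(downloaded.read())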

Notice that the argument to drive.ListFile is a dictionary whose keys coincide with the parameters used by the Google Drive HTTP API (you can customize the q parameter to suit your use case).
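As a rough illustration of customizing q, the sketch below builds a query string that mimics the my_data* glob from the question by filtering on a title prefix. The helper name build_prefix_query is hypothetical (it is not part of PyDrive), and the sketch assumes the Drive v2 search syntax linked in the code above, where title contains does prefix matching and single quotes are escaped with a backslash.

```python
def build_prefix_query(folder_id, title_prefix):
    """Build a Drive v2 'q' string (hypothetical helper, not a PyDrive API).

    Selects non-trashed files that sit directly inside `folder_id` and whose
    title starts with `title_prefix`.
    """
    # Escape backslashes first, then single quotes, as the query syntax requires.
    safe_prefix = title_prefix.replace("\\", "\\\\").replace("'", "\\'")
    return ("'%s' in parents and title contains '%s' and trashed = false"
            % (folder_id, safe_prefix))


# Example: the query for files named my_data* in the folder used above.
query = build_prefix_query('1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk', 'my_data')
print(query)
```

The resulting string would then be passed as drive.ListFile({'q': query}).GetList() in place of the hard-coded query.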

Know that in all cases, files and folders on Google Drive are identified by ids (note the 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk in the query above). This means you have to find the specific id of the folder you want to root your search in.
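One possible way to do that lookup in code, rather than by hand, is to walk the path one folder at a time using the same ListFile queries. This is only a sketch: resolve_folder_id and child_folder_query are hypothetical helper names, drive is assumed to be the GoogleDrive client created earlier, and the sketch assumes the 'root' alias and the folder MIME type from the Drive v2 API. If two folders share a name, it simply takes the first match.

```python
FOLDER_MIME = 'application/vnd.google-apps.folder'


def child_folder_query(parent_id, name):
    """Query for a non-trashed folder called `name` directly under `parent_id`."""
    safe_name = name.replace("\\", "\\\\").replace("'", "\\'")
    return ("'%s' in parents and title = '%s' and mimeType = '%s' "
            "and trashed = false" % (parent_id, safe_name, FOLDER_MIME))


def resolve_folder_id(drive, path):
    """Walk a path like '/projects/my_project' and return the last folder's id.

    `drive` is assumed to behave like a PyDrive GoogleDrive client.
    """
    current_id = 'root'  # alias for the top of My Drive (assumption: v2 API)
    for name in [part for part in path.split('/') if part]:
        matches = drive.ListFile(
            {'q': child_folder_query(current_id, name)}).GetList()
        if not matches:
            raise FileNotFoundError('no folder named %r under %s'
                                    % (name, current_id))
        current_id = matches[0]['id']  # first match wins if names collide
    return current_id
```

With a real client this would be called as resolve_folder_id(drive, '/projects/my_project'). Looking the id up by hand in the browser works just as well.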

For example, navigate to the folder "/projects/my_project/my_data" in your Google Drive.

See that it contains some files that we want to download to Colab. To get the id of the folder for use with PyDrive, look at the URL in your browser and extract the id. In this case, the id is the last piece of the folder's URL: 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk.
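If you would rather paste the whole URL than pick the id out by eye, a small helper can extract it. This is a minimal sketch: folder_id_from_url is a hypothetical name, and the URL shapes it handles (an id as the last path segment, or an id= query parameter) are assumptions about how Drive folder links look, not something the answer specifies.

```python
from urllib.parse import urlparse, parse_qs


def folder_id_from_url(url):
    """Return the folder id from a Drive folder URL, or None if none is found.

    Assumes the id is either an `id=` query parameter or the last path segment.
    """
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    if 'id' in query:
        return query['id'][0]
    segments = [segment for segment in parsed.path.split('/') if segment]
    return segments[-1] if segments else None


# Assumed URL shape, using the id from the answer.
print(folder_id_from_url(
    'https://drive.google.com/drive/folders/1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk'))
```

The returned string is what goes in front of in parents in the ListFile query.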
