How to connect to a private storage bucket using the Google Colab TPU
Question
I am using Google Colab Pro and the provided TPU. I need to load a pre-trained model onto the TPU.
- The TPU can only load data from a Google Cloud Storage bucket.
- I created a Cloud Storage bucket and extracted the pre-trained model files into it.
Now I need to give the TPU permission to access my private bucket, but I don't know the TPU's service account. How do I find it?
For now I have just granted All:R read permission on the bucket, and the TPU initialized successfully, but this is clearly not the optimal solution.
Recommended answer
I've been struggling with this scenario myself (although with the free version of Colab) and just got it to work. This specific use case doesn't appear to be well documented; the official documentation mostly deals with cases involving a Compute Engine VM rather than an auto-assigned TPU. The process that worked for me went as follows:
- Run the Google Cloud SDK authentication and set the project (the two authentication methods may be redundant; I haven't tried using just one of them)
!gcloud auth login
!gcloud config set project [Project ID of Storage Bucket]
and
from google.colab import auth
auth.authenticate_user()
- Initialize the TPU (from the TensorFlow TPU documentation)
import os
import tensorflow as tf
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
# Newer TensorFlow releases expose this as tf.distribute.TPUStrategy.
strategy = tf.distribute.experimental.TPUStrategy(resolver)
- Try to load the model
model = tf.keras.models.load_model('gs://[Bucket name and path to saved model]')
This initially failed, but the error message included the service account of the TPU that was trying to access the directory, and this is the address I gave access to, as described in the Cloud Storage docs. The address is in the
service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com
format but the project number isn't the Project ID of the project my bucket is in, nor a value I've been able to find anywhere else.
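Since the error message is the only place the account shows up, a small helper can pull it out of the text. This is a minimal sketch rather than part of the original answer: the function name `extract_tpu_service_account` and the sample error text are hypothetical, and the pattern assumes only the address format quoted above.

```python
import re

# Matches the Cloud TPU service account format quoted in the answer.
_TPU_SA_PATTERN = re.compile(r"service-\d+@cloud-tpu\.iam\.gserviceaccount\.com")


def extract_tpu_service_account(error_text):
    """Return the first Cloud TPU service account found in error_text, or None."""
    match = _TPU_SA_PATTERN.search(error_text)
    return match.group(0) if match else None


# Hypothetical error text, for illustration only; real messages differ in wording.
sample_error = (
    "service-123456789012@cloud-tpu.iam.gserviceaccount.com does not have "
    "storage.objects.get access to the bucket."
)
print(extract_tpu_service_account(sample_error))
```

In a notebook, one way to use this would be to wrap the `load_model` call in a `try`/`except Exception as exc` block and pass `str(exc)` to the helper.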
After I gave permissions to that service account (which I was only able to find in the error message), I was able to load and save models from my private bucket.
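The answer doesn't say how the permission was granted. One possible way, sketched here under assumptions, is the `gsutil iam ch` command. The helper names, the placeholder account and bucket, and the choice of the `objectAdmin` role (read and write access, to cover both loading and saving) are all assumptions; `objectViewer` would likely be enough if the TPU only needs to read.

```python
import subprocess


def build_grant_command(service_account, bucket_name, role="objectAdmin"):
    """Build a `gsutil iam ch` command granting `role` on the bucket."""
    binding = f"serviceAccount:{service_account}:{role}"
    return ["gsutil", "iam", "ch", binding, f"gs://{bucket_name}"]


def grant_bucket_access(service_account, bucket_name, role="objectAdmin"):
    """Run the command; raises CalledProcessError if gsutil reports failure."""
    subprocess.run(build_grant_command(service_account, bucket_name, role), check=True)


# Placeholder values, not real identifiers.
print(" ".join(build_grant_command(
    "service-123456789012@cloud-tpu.iam.gserviceaccount.com", "my-private-bucket")))
```

The printed command can also be run directly in a Colab cell with a leading `!`, after the `gcloud` authentication step above. Once the TPU's account has access, the public All:R grant mentioned in the question should no longer be necessary.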