Cloud ML Unable to find the file on Google Cloud Storage


Problem description

I am reading my data file using the following commands:

data_dir = arguments['data_dir']
data = pd.read_csv(data_dir + "/train.csv")

I am using this data to train my model on Google Cloud ML. The job is scheduled successfully, but fetching the file fails with the following IO error:

IOError: File gs://cloud-bucket/data/train.csv does not exist

The file path is correct: I uploaded the file to the bucket mentioned above using the console. Cloud ML is also running in the same region and is configured with the same project as my bucket.

Solution

GCS is not a POSIX file system, so you typically cannot use "regular" file libraries to manipulate files on GCS. That includes, of course, convenience functions like pd.read_csv when they are given a gs:// path directly.

In the case of pandas, you can pass a file handle instead of a path, so I recommend using TensorFlow's file_io.FileIO wrapper. It can read from both GCS and standard POSIX file systems, which lets you run the same code locally and on the cloud:

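import pandas as pd
from tensorflow.python.lib.io import file_io

data_dir = arguments['data_dir']
# FileIO takes an explicit mode; it accepts both gs:// URIs and local paths.
with file_io.FileIO(data_dir + "/train.csv", mode="r") as f:
  data = pd.read_csv(f)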

It might also be helpful to test your code by running it locally and passing in GCS filenames before submitting a cloud job.
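
To make that local test cheap, one option is a small helper that chooses the opener from the path. The following is only a sketch under stated assumptions: `open_any` and `read_rows` are hypothetical names that do not appear in the original answer, and the standard-library `csv` module stands in for pandas so the example runs without extra dependencies. With pandas installed, the same file handle could be passed to pd.read_csv instead.

```python
import csv


def open_any(path, mode="r"):
    """Return a file-like object for a local path or a gs:// URI.

    Hypothetical helper for illustration, not part of TensorFlow or pandas.
    """
    if path.startswith("gs://"):
        # Imported lazily so that purely local runs do not need TensorFlow.
        from tensorflow.python.lib.io import file_io
        return file_io.FileIO(path, mode)
    # Plain local files go through the built-in open().
    return open(path, mode)


def read_rows(data_dir, filename="train.csv"):
    """Read a CSV through a file handle and return its rows as dicts."""
    with open_any(data_dir + "/" + filename) as f:
        # csv.DictReader is a stand-in here; pd.read_csv(f) takes the same handle.
        return list(csv.DictReader(f))
```

Calling read_rows with a local directory exercises the same code path that later receives the gs:// directory in the cloud job; only the opener differs. Running it once locally with the real gs:// path, as suggested above, should surface a missing file or a permissions problem before the job is submitted.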
