"Unable to get Filesystem for path" error when training a neural network on Google Cloud

Question

I am using Google Cloud to train a neural network in the cloud, as in the following example:

To start, I set the following environment variables:

PROJECT_ID=$(gcloud config list project --format "value(core.project)")
BUCKET_NAME=${PROJECT_ID}-mlengine

I then uploaded my training and evaluation data, two CSV files named train_set.csv and eval_set.csv, to Google Cloud Storage with the following command:

gsutil cp -r data gs://$BUCKET_NAME

I then verified that these two CSV files were in the polar-terminal-160506-mlengine/data directory in my Google Cloud Storage bucket.

I then made the following environment variable assignments:

# Assign appropriate values.
PROJECT=$(gcloud config list project --format "value(core.project)")
JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
GCS_PATH="${BUCKET}/${USER}/${JOB_ID}"
DICT_FILE=gs://cloud-ml-data/img/flower_photos/dict.txt

I then tried to preprocess my evaluation data like so:

# Preprocess the eval set.
python trainer/preprocess.py \
  --input_dict "$DICT_FILE" \
  --input_path "gs://cloud-ml-data/img/flower_photos/eval_set.csv" \
  --output_path "${GCS_PATH}/preproc/eval" \
  --cloud

Sadly, this runs for a bit and then crashes with the following error:

ValueError: Unable to get the Filesystem for path gs://polar-terminal-160506-mlengine/data/eval_set.csv

This doesn't seem possible, as I have confirmed in the Google Cloud Storage console that eval_set.csv is stored at this location. Is this perhaps a permissions issue, or something I am not seeing?

I have traced this runtime error to a particular line in the trainer/preprocess.py file. The line is this one:

read_input_source = beam.io.ReadFromText(
      opt.input_path, strip_trailing_newlines=True)

This seems like a pretty good clue, but I am still not sure what is going on. When I google "beam.io.ReadFromText ValueError: Unable to get the Filesystem for path", nothing relevant appears, which is a bit odd. Thoughts?

Recommended answer

It looks like your apache-beam library installation might be incomplete.

Try pip install apache-beam[gcp]

The gcp extra installs the dependencies that allow Apache Beam to access files stored on Google Cloud Storage.
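One way to check whether the installed Beam can handle gs:// paths, before launching the preprocessing job, is to ask Beam for the filesystem directly. The following is a minimal sketch, not part of the original answer. The function name check_beam_gcs_support and the bucket name example-bucket are hypothetical, and it assumes that FileSystems.get_filesystem raises ValueError when no filesystem is registered for the path's scheme, which is what the traceback in the question suggests. The lookup only inspects the path's scheme, so the bucket does not need to exist.

```python
import importlib
import importlib.util


def check_beam_gcs_support(path="gs://example-bucket/data/eval_set.csv"):
    """Report whether Apache Beam can resolve a filesystem for `path`.

    `path` is a hypothetical placeholder; substitute your own gs:// URI.
    """
    # Guard against Beam being absent altogether.
    if importlib.util.find_spec("apache_beam") is None:
        return "apache-beam is not installed"

    filesystems = importlib.import_module("apache_beam.io.filesystems")
    try:
        # Beam maps the URI scheme ("gs") to a registered filesystem class.
        # Assumption: without the gcp extra, no class is registered for "gs"
        # and this call raises ValueError.
        fs = filesystems.FileSystems.get_filesystem(path)
    except ValueError as exc:
        return f"no filesystem registered: {exc}"
    return f"ok: {type(fs).__name__}"


print(check_beam_gcs_support())
```

If the result starts with "no filesystem registered", reinstalling with the gcp extra as suggested above is the likely fix. A result starting with "ok" suggests the installation is complete and the problem lies elsewhere, for example in permissions or in the path itself.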

The Apache Beam package is available here.
