Error after running a job in Google Cloud ML


Problem description

I tried running a word-RNN model from GitHub on Google Cloud ML. After submitting the job, I get errors in the log file.

This is the training job I submitted:

gcloud ml-engine jobs submit training word_pred_7 \
    --package-path trainer \
    --module-name trainer.train \
    --runtime-version 1.0 \
    --job-dir $JOB_DIR \
    --region $REGION \
    -- \
    --data_dir gs://model-development/arpit/word-rnn-tensorflow-master/data/tinyshakespeare/real1.txt \
    --save_dir gs://model-development/arpit/word-rnn-tensorflow-master/save

This is what I get in the log file.

Recommended answer

Finally, after submitting 77 jobs to Cloud ML, I was able to run the job. The problem was not with the arguments passed when submitting the job. It was caused by IO errors from the .npy files, which have to be stored using file_io.FileIO and read as StringIO.
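The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the word-RNN repository: the helper names `save_npy` and `load_npy` and the `opener` parameter are hypothetical. The array is serialized into an in-memory buffer and the raw bytes are written through a file opener, so NumPy never has to open the path itself. The runnable part uses the built-in `open` on a local temporary file. On Cloud ML the assumption is that you would pass `file_io.FileIO` (from `tensorflow.python.lib.io`) as the opener together with a `gs://` path. On Python 2 the buffer would be `StringIO`, as the answer says; `io.BytesIO` plays that role on Python 3.

```python
import io
import os
import tempfile

import numpy as np


def save_npy(path, array, opener=open):
    """Serialize `array` in memory, then write the bytes through `opener`."""
    buffer = io.BytesIO()
    np.save(buffer, array)
    with opener(path, "wb") as f:
        f.write(buffer.getvalue())


def load_npy(path, opener=open):
    """Read all bytes through `opener`, then let NumPy parse them from memory."""
    with opener(path, "rb") as f:
        buffer = io.BytesIO(f.read())
    return np.load(buffer)


# Local round trip using the built-in open().
# Assumption: on Cloud ML you would instead call, for example,
#   from tensorflow.python.lib.io import file_io
#   save_npy("gs://<bucket>/<path>/tensor.npy", tensor, opener=file_io.FileIO)
tensor = np.arange(6).reshape(2, 3)
local_path = os.path.join(tempfile.mkdtemp(), "tensor.npy")
save_npy(local_path, tensor)
restored = load_npy(local_path)
```

Passing the opener as a parameter keeps the same helpers usable for local runs and for Cloud Storage paths. Whether binary modes such as "wb" are accepted by `file_io.FileIO` may depend on the TensorFlow version in use.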

I did not find these IO errors mentioned anywhere, so check for them if you see an error that says "No such file or directory".
