Submitting a Training Job to Google Cloud ML


Problem Description


I have the code below that I want to submit to Google Cloud ML. I have already tested their example and got results.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np

# Data sets
I_TRAINING = "/home/android/Desktop/training.csv"
I_TEST = "/home/android/Desktop/test.csv"

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv(filename=I_TRAINING, target_dtype=np.int)
test_set = tf.contrib.learn.datasets.base.load_csv(filename=I_TEST, target_dtype=np.int)

# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=2)]

# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=2,
                                            model_dir="/tmp/my_model")

# Fit model.
classifier.fit(x=training_set.data, y=training_set.target, steps=2000)

# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=test_set.data, y=test_set.target)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))

# Classify two new flower samples.
# new_samples = np.array(
#     [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float)
# y = classifier.predict(new_samples)
# print('Predictions: {}'.format(str(y)))

It's code to train and create a DNN model in TensorFlow. I already tested it locally and received results. I put this code in a folder named trainer along with an __init__.py file, and uploaded the folder to gs://bucket-ml/second_job/trainer. second_job is the JOB_NAME.

Then, when I want to submit this as a job, I do this and get the following error:

gcloud beta ml jobs submit training ${JOB_NAME}  \ 
--package-path=trainer   \
--module-name=trainer.trainer   \
--staging-bucket="${TRAIN_BUCKET}"   \
--region=us-central1   \
--train_dir="${TRAIN_PATH}/train"

ERROR: (gcloud.beta.ml.jobs.submit.training) 
    Packaging of user python code failed with message:
      running sdist
running egg_info
creating trainer.egg-info
writing trainer.egg-info/PKG-INFO
writing top-level names to trainer.egg-info/top_level.txt
writing dependency_links to trainer.egg-info/dependency_links.txt
writing manifest file 'trainer.egg-info/SOURCES.txt'
error: package directory 'trainer' does not exist
    Try manually writing a setup.py file at your package root
    and rerunning the command

I am not sure if the package-path and module-name are correct. Please advise me on what to do. Thanks.

Solution

The --package-path argument to the gcloud command should point to a directory that is a valid Python package, i.e., a directory that contains an __init__.py file (often an empty file). Note that it should be a local directory, not one on GCS.

The --module-name argument will be the fully qualified name of a valid Python module within that package. You can organize your directories however you want, but for the sake of consistency, the samples all have a Python package named trainer with the module to be run named task.py.

The directory structure of the samples looks like:

trainer/
  __init__.py
  task.py

__init__.py will likely be an empty file. task.py contains your code. Then you can submit your job as follows:

gcloud beta ml jobs submit training ${JOB_NAME}  \ 
  --package-path=trainer   \
  --module-name=trainer.task   \
  --staging-bucket="${TRAIN_BUCKET}"   \
  --region=us-central1   \
  -- \
  --train_dir="${TRAIN_PATH}/train"
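The trainer/ layout above can be created locally with a couple of commands (a sketch; task.py is created empty here and would hold your training code):

```shell
# Create the local package layout that gcloud packages up.
mkdir -p trainer
touch trainer/__init__.py   # empty marker file; makes trainer/ a Python package
touch trainer/task.py       # your training code goes here (module trainer.task)
```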

You can choose whatever names you want for your package and modules; just make sure the names on disk and the gcloud arguments match up: the top-level directory is --package-path and the file with the code to run is --module-name (without the .py suffix).
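As a fallback, the packaging error above suggests writing a setup.py manually at the package root (i.e., next to the trainer/ directory, not inside it). A minimal sketch, with the name and version as assumptions:

```python
# setup.py -- lives alongside the trainer/ directory, not inside it.
from setuptools import find_packages, setup

setup(
    name='trainer',            # assumed distribution name, matching the directory
    version='0.1',             # assumed version
    packages=find_packages(),  # discovers trainer/ via its __init__.py
)
```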

A few notes:

  • Note the extra '-- \'. That indicates that all following arguments should be passed through to your program. That is, --train_dir is NOT an argument to gcloud beta ml jobs submit training and will be passed as a flag to your program.
  • If you intend to use train_dir, you'll need to add some flag parsing to your code, e.g., using argparse.
  • Files you read in the cloud need to be on GCS.
  • Although flag parsing gives you more flexibility, it's not required. You can hard-code paths to filenames. Just make sure they point to objects on GCS (and then remove the --train_dir flag from the gcloud call).
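The flag-parsing note above can be sketched with argparse, e.g. at the top of trainer/task.py (the default path is just a placeholder based on the bucket in the question):

```python
import argparse

# Parse the pass-through --train_dir flag that gcloud forwards after the '--'.
parser = argparse.ArgumentParser()
parser.add_argument('--train_dir',
                    default='gs://bucket-ml/second_job/train',
                    help='GCS path where checkpoints and outputs are written')
# parse_known_args tolerates any extra flags the service may append.
args, unknown = parser.parse_known_args()

print('Writing training output to', args.train_dir)
```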
