如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类? [英] How to classify a QuickDraw doodle using TensorFlow's sketch RNN tutorial?
问题描述
说明:
- 这个问题与 QuickDraw RNN绘图分类tensorflow教程有关,不是 文本RNN张量流教程
- 为了扩展,这是
- This question is regarding this QuickDraw RNN Drawing classification tensorflow tutorial, not the text RNN tensorflow tutorial
- To an extend this is a duplicate of Farooq Khan's question, however I could use a few more specific details (that otherwise easily become cumbersome comments) and take the opportunity to reward Farooq for taking his time to provide further details.
我正在使用NVIDIA GeForce GT 750M 2048 MB GPU的Macbook上运行从源代码编译并具有GPU支持的tensorflow 1.6.0-rc0.
I'm running tensorflow 1.6.0-rc0 compiled from source with GPU support on a Macbook with an NVIDIA GeForce GT 750M 2048 MB GPU.
我试图像这样训练:
python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00000-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00000-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm
我正在寻找的初步澄清是:
The initial clarifications I'm looking for are:
- 我应该使用上面的命令,然后在完成后运行:
python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00001-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00001-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm
至python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00009-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00009-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm
,还是应该按原样运行教程中提到的命令:python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
- 我怎么知道培训何时完成? (这些是上次培训课程中的最后一条消息:
2018-04-11 01:43:27.180805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0 2018-04-11 01:43:27.180860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-04-11 01:43:27.180866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0 2018-04-11 01:43:27.180869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N 2018-04-11 01:43:27.180950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 100 MB memory) -> physical GPU (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0, compute capability: 3.0)
之后没有错误或其他任何输出:很难将这些消息与之前的其他检查点区分开来) - 如何通过自定义涂鸦进行分类?这是我的问题的核心. Farooq的回答
create_tfrecord_for_prediction
很棒:可以运行/测试的完整脚本太棒了
- should I use the command above, then once it completes run :
python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00001-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00001-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm
through topython train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00009-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00009-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm
or should I run the command mentioned in the tutorial as-is:python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
- how do I know when training is complete ? (these are the last messages from the last training session:
2018-04-11 01:43:27.180805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0 2018-04-11 01:43:27.180860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-04-11 01:43:27.180866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0 2018-04-11 01:43:27.180869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N 2018-04-11 01:43:27.180950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 100 MB memory) -> physical GPU (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0, compute capability: 3.0)
No error or any other output afterwards: hard to distinguish those messages from other previous checkpoints) - how to pass a custom doodle for classification ? This is the core of my question. Farooq's
create_tfrecord_for_prediction
in his answer is great: a full script to run/test would be amazing
Update2
感谢Farooq的帮助说明,下面是经过调整的代码版本,可在控制台上显示预测:
Thanks to Farooq's helpful notes, here is a tweaked version of the code that prints a prediction to console:
# Copyright 2017 The TensorFlow Authors. All Rights Reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ============================================================================== r"""Binary for trianing a RNN-based classifier for the Quick, Draw! data. python train_model.py \ --training_data train_data \ --eval_data eval_data \ --model_dir /tmp/quickdraw_model/ \ --cell_type cudnn_lstm When running on GPUs using --cell_type cudnn_lstm is much faster. The expected performance is ~75% in 1.5M steps with the default configuration. """ from __future__ import absolute_import from __future__ import division from __future__ import print_function import argparse import ast import functools import sys from datetime import datetime import json import numpy as np import tensorflow as tf def get_num_classes(): classes = [] with tf.gfile.GFile(FLAGS.classes_file, "r") as f: classes = [x for x in f] num_classes = len(classes) return num_classes def get_input_fn(mode, tfrecord_pattern, batch_size): """Creates an input_fn that stores all the data in memory. Args: mode: one of tf.contrib.learn.ModeKeys.{TRAIN, INFER, EVAL} tfrecord_pattern: path to a TF record file created using create_dataset.py. batch_size: the batch size to output. Returns: A valid input_fn for the model estimator. """ def _parse_tfexample_fn(example_proto, mode): """Parse a single record which is expected to be a tensorflow.Example.""" feature_to_type = { "ink": tf.VarLenFeature(dtype=tf.float32), "shape": tf.FixedLenFeature([2], dtype=tf.int64) } if mode != tf.estimator.ModeKeys.PREDICT: # The labels won't be available at inference time, so don't add them # to the list of feature_columns to be read. feature_to_type["class_index"] = tf.FixedLenFeature([1], dtype=tf.int64) parsed_features = tf.parse_single_example(example_proto, feature_to_type) parsed_features["ink"] = tf.sparse_tensor_to_dense(parsed_features["ink"]) if mode != tf.estimator.ModeKeys.PREDICT: labels = parsed_features["class_index"] return parsed_features, labels else: return parsed_features # In prediction, we have no labels def _input_fn(): """Estimator `input_fn`. Returns: A tuple of: - Dictionary of string feature name to `Tensor`. - `Tensor` of target labels. """ dataset = tf.data.TFRecordDataset.list_files(tfrecord_pattern) if mode == tf.estimator.ModeKeys.TRAIN: dataset = dataset.shuffle(buffer_size=10) dataset = dataset.repeat() # Preprocesses 10 files concurrently and interleaves records from each file. dataset = dataset.interleave( tf.data.TFRecordDataset, cycle_length=10, block_length=1) dataset = dataset.map( functools.partial(_parse_tfexample_fn, mode=mode), num_parallel_calls=10) dataset = dataset.prefetch(10000) if mode == tf.estimator.ModeKeys.TRAIN: dataset = dataset.shuffle(buffer_size=1000000) # Our inputs are variable length, so pad them. dataset = dataset.padded_batch( batch_size, padded_shapes=dataset.output_shapes) iter = dataset.make_one_shot_iterator() if mode != tf.estimator.ModeKeys.PREDICT: features, labels = iter.get_next() return features, labels else: features = iter.get_next() return features, None # In prediction, we have no labels return _input_fn def model_fn(features, labels, mode, params): """Model function for RNN classifier. This function sets up a neural network which applies convolutional layers (as configured with params.num_conv and params.conv_len) to the input. The output of the convolutional layers is given to LSTM layers (as configured with params.num_layers and params.num_nodes). The final state of the all LSTM layers are concatenated and fed to a fully connected layer to obtain the final classification scores. Args: features: dictionary with keys: inks, lengths. labels: one hot encoded classes mode: one of tf.estimator.ModeKeys.{TRAIN, INFER, EVAL} params: a parameter dictionary with the following keys: num_layers, num_nodes, batch_size, num_conv, conv_len, num_classes, learning_rate. Returns: ModelFnOps for Estimator API. """ def _get_input_tensors(features, labels): """Converts the input dict into inks, lengths, and labels tensors.""" # features[ink] is a sparse tensor that is [8, batch_maxlen, 3] # inks will be a dense tensor of [8, maxlen, 3] # shapes is [batchsize, 2] shapes = features["shape"] # lengths will be [batch_size] lengths = tf.squeeze( tf.slice(shapes, begin=[0, 0], size=[params.batch_size, 1])) inks = tf.reshape(features["ink"], [params.batch_size, -1, 3]) if labels is not None: labels = tf.squeeze(labels) return inks, lengths, labels def _add_conv_layers(inks, lengths): """Adds convolution layers.""" convolved = inks for i in range(len(params.num_conv)): convolved_input = convolved if params.batch_norm: convolved_input = tf.layers.batch_normalization( convolved_input, training=(mode == tf.estimator.ModeKeys.TRAIN)) # Add dropout layer if enabled and not first convolution layer. if i > 0 and params.dropout: convolved_input = tf.layers.dropout( convolved_input, rate=params.dropout, training=(mode == tf.estimator.ModeKeys.TRAIN)) convolved = tf.layers.conv1d( convolved_input, filters=params.num_conv[i], kernel_size=params.conv_len[i], activation=None, strides=1, padding="same", name="conv1d_%d" % i) return convolved, lengths def _add_regular_rnn_layers(convolved, lengths): """Adds RNN layers.""" if params.cell_type == "lstm": cell = tf.nn.rnn_cell.BasicLSTMCell elif params.cell_type == "block_lstm": cell = tf.contrib.rnn.LSTMBlockCell cells_fw = [cell(params.num_nodes) for _ in range(params.num_layers)] cells_bw = [cell(params.num_nodes) for _ in range(params.num_layers)] if params.dropout > 0.0: cells_fw = [tf.contrib.rnn.DropoutWrapper(cell) for cell in cells_fw] cells_bw = [tf.contrib.rnn.DropoutWrapper(cell) for cell in cells_bw] outputs, _, _ = tf.contrib.rnn.stack_bidirectional_dynamic_rnn( cells_fw=cells_fw, cells_bw=cells_bw, inputs=convolved, sequence_length=lengths, dtype=tf.float32, scope="rnn_classification") return outputs def _add_cudnn_rnn_layers(convolved): """Adds CUDNN LSTM layers.""" # Convolutions output [B, L, Ch], while CudnnLSTM is time-major. convolved = tf.transpose(convolved, [1, 0, 2]) lstm = tf.contrib.cudnn_rnn.CudnnLSTM( num_layers=params.num_layers, num_units=params.num_nodes, dropout=params.dropout if mode == tf.estimator.ModeKeys.TRAIN else 0.0, direction="bidirectional") outputs, _ = lstm(convolved) # Convert back from time-major outputs to batch-major outputs. outputs = tf.transpose(outputs, [1, 0, 2]) return outputs def _add_rnn_layers(convolved, lengths): """Adds recurrent neural network layers depending on the cell type.""" if params.cell_type != "cudnn_lstm": outputs = _add_regular_rnn_layers(convolved, lengths) else: outputs = _add_cudnn_rnn_layers(convolved) # outputs is [batch_size, L, N] where L is the maximal sequence length and N # the number of nodes in the last layer. mask = tf.tile( tf.expand_dims(tf.sequence_mask(lengths, tf.shape(outputs)[1]), 2), [1, 1, tf.shape(outputs)[2]]) zero_outside = tf.where(mask, outputs, tf.zeros_like(outputs)) outputs = tf.reduce_sum(zero_outside, axis=1) return outputs def _add_fc_layers(final_state): """Adds a fully connected layer.""" return tf.layers.dense(final_state, params.num_classes) # Build the model. inks, lengths, labels = _get_input_tensors(features, labels) convolved, lengths = _add_conv_layers(inks, lengths) final_state = _add_rnn_layers(convolved, lengths) logits = _add_fc_layers(final_state) # Compute current predictions. predictions = tf.argmax(logits, axis=1) if mode == tf.estimator.ModeKeys.PREDICT: preds = { "class_index": predictions, #"class_index": predictions[:, tf.newaxis], "probabilities": tf.nn.softmax(logits), "logits": logits } #preds = {"logits": logits, "predictions": predictions} return tf.estimator.EstimatorSpec(mode, predictions=preds) # Add the loss. cross_entropy = tf.reduce_mean( tf.nn.sparse_softmax_cross_entropy_with_logits( labels=labels, logits=logits)) # Add the optimizer. train_op = tf.contrib.layers.optimize_loss( loss=cross_entropy, global_step=tf.train.get_global_step(), learning_rate=params.learning_rate, optimizer="Adam", # some gradient clipping stabilizes training in the beginning. clip_gradients=params.gradient_clipping_norm, summaries=["learning_rate", "loss", "gradients", "gradient_norm"]) return tf.estimator.EstimatorSpec( mode=mode, predictions={"logits": logits, "predictions": predictions}, loss=cross_entropy, train_op=train_op, eval_metric_ops={"accuracy": tf.metrics.accuracy(labels, predictions)}) def create_estimator_and_specs(run_config): """Creates an Experiment configuration based on the estimator and input fn.""" model_params = tf.contrib.training.HParams( num_layers=FLAGS.num_layers, num_nodes=FLAGS.num_nodes, batch_size=FLAGS.batch_size, num_conv=ast.literal_eval(FLAGS.num_conv), conv_len=ast.literal_eval(FLAGS.conv_len), num_classes=get_num_classes(), learning_rate=FLAGS.learning_rate, gradient_clipping_norm=FLAGS.gradient_clipping_norm, cell_type=FLAGS.cell_type, batch_norm=FLAGS.batch_norm, dropout=FLAGS.dropout) estimator = tf.estimator.Estimator( model_fn=model_fn, config=run_config, params=model_params) train_spec = tf.estimator.TrainSpec(input_fn=get_input_fn( mode=tf.estimator.ModeKeys.TRAIN, tfrecord_pattern=FLAGS.training_data, batch_size=FLAGS.batch_size), max_steps=FLAGS.steps) eval_spec = tf.estimator.EvalSpec(input_fn=get_input_fn( mode=tf.estimator.ModeKeys.EVAL, tfrecord_pattern=FLAGS.eval_data, batch_size=FLAGS.batch_size)) return estimator, train_spec, eval_spec # def main(unused_args): # estimator, train_spec, eval_spec = create_estimator_and_specs( # run_config=tf.estimator.RunConfig( # model_dir=FLAGS.model_dir, # save_checkpoints_secs=300, # save_summary_steps=100)) # tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) def create_tfrecord_for_prediction(batch_size, stoke_data, tfrecord_file): def parse_line(stoke_data): """Parse provided stroke data and ink (as np array) and classname.""" inkarray = json.loads(stoke_data) stroke_lengths = [len(stroke[0]) for stroke in inkarray] total_points = sum(stroke_lengths) np_ink = np.zeros((total_points, 3), dtype=np.float32) current_t = 0 for stroke in inkarray: if len(stroke[0]) != len(stroke[1]): print("Inconsistent number of x and y coordinates.") return None for i in [0, 1]: np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i] current_t += len(stroke[0]) np_ink[current_t - 1, 2] = 1 # stroke_end # Preprocessing. # 1. Size normalization. lower = np.min(np_ink[:, 0:2], axis=0) upper = np.max(np_ink[:, 0:2], axis=0) scale = upper - lower scale[scale == 0] = 1 np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale # 2. Compute deltas. #np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2] #np_ink = np_ink[1:, :] np_ink[1:, 0:2] -= np_ink[0:-1, 0:2] np_ink = np_ink[1:, :] features = {} features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten())) features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape)) f = tf.train.Features(feature=features) ex = tf.train.Example(features=f) return ex if stoke_data is None: print("Error: Stroke data cannot be none") return example = parse_line(stoke_data) #Remove the file if it already exists if tf.gfile.Exists(tfrecord_file): tf.gfile.Remove(tfrecord_file) writer = tf.python_io.TFRecordWriter(tfrecord_file) for i in range(batch_size): writer.write(example.SerializeToString()) writer.flush() writer.close() print ('wrote',tfrecord_file) def get_classes(): classes = [] with tf.gfile.GFile(FLAGS.classes_file, "r") as f: classes = [x.rstrip() for x in f] return classes def main(unused_args): print("%s: I Starting application" % (datetime.now())) print("FLAGS",FLAGS) estimator, train_spec, eval_spec = create_estimator_and_specs( run_config=tf.estimator.RunConfig( model_dir=FLAGS.model_dir, save_checkpoints_secs=300, save_summary_steps=100)) print("estimator",estimator,"train_spec",train_spec,"eval_spec",eval_spec) tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) if FLAGS.predict_for_data != None: print("%s: I Starting prediction" % (datetime.now())) class_names = get_classes() create_tfrecord_for_prediction(FLAGS.batch_size, FLAGS.predict_for_data, FLAGS.predict_temp_file) predict_results = estimator.predict(input_fn=get_input_fn( mode=tf.estimator.ModeKeys.PREDICT, tfrecord_pattern=FLAGS.predict_temp_file, batch_size=FLAGS.batch_size)) #predict_results = estimator.predict(input_fn=predict_input_fn) for idx, prediction in enumerate(predict_results): index = prediction["class_index"] # Get the predicted class (index) probability = prediction["probabilities"][index] class_name = class_names[index] print("%s: Predicted Class is: %s with a probability of %f" % (datetime.now(), class_name, probability)) break #We care for only the first prediction, rest are all duplicates just to meet the batch size if __name__ == "__main__": parser = argparse.ArgumentParser() parser.register("type", "bool", lambda v: v.lower() == "true") parser.add_argument( "--training_data", type=str, default="", help="Path to training data (tf.Example in TFRecord format)") parser.add_argument( "--eval_data", type=str, default="", help="Path to evaluation data (tf.Example in TFRecord format)") parser.add_argument( "--classes_file", type=str, default="", help="Path to a file with the classes - one class per line") parser.add_argument( "--num_layers", type=int, default=3, help="Number of recurrent neural network layers.") parser.add_argument( "--num_nodes", type=int, default=128, help="Number of node per recurrent network layer.") parser.add_argument( "--num_conv", type=str, default="[48, 64, 96]", help="Number of conv layers along with number of filters per layer.") parser.add_argument( "--conv_len", type=str, default="[5, 5, 3]", help="Length of the convolution filters.") parser.add_argument( "--cell_type", type=str, default="lstm", help="Cell type used for rnn layers: cudnn_lstm, lstm or block_lstm.") parser.add_argument( "--batch_norm", type="bool", default="False", help="Whether to enable batch normalization or not.") parser.add_argument( "--learning_rate", type=float, default=0.0001, help="Learning rate used for training.") parser.add_argument( "--gradient_clipping_norm", type=float, default=9.0, help="Gradient clipping norm used during training.") parser.add_argument( "--dropout", type=float, default=0.3, help="Dropout used for convolutions and bidi lstm layers.") parser.add_argument( "--steps", type=int, default=100000, help="Number of training steps.") parser.add_argument( "--batch_size", type=int, default=8, help="Batch size to use for training/evaluation.") parser.add_argument( "--model_dir", type=str, default="", help="Path for storing the model checkpoints.") parser.add_argument( "--self_test", type=bool, default="False", help="Whether to enable batch normalization or not.") parser.add_argument( "--predict_for_data", type=str, default="[[[73,66,46,23,12,11,22,48,58,67,70,65],[11,6,2,10,23,33,48,56,54,41,22,10]],[[66,85,71],[9,3,26]],[[24,1,2,8],[6,1,10,19]],[[64,88,134,176,180,184,184,174,111,63,47],[34,29,28,35,39,58,91,94,86,71,62]],[[64,61,62],[74,83,102]],[[83,84,87],[78,102,107]],[[157,159,164],[96,108,116]],[[175,182],[91,115]],[[182,186,198,209,223,234,251,255],[51,36,29,30,38,39,20,8]],[[157,136,128,133,139],[35,47,57,35,29]],[[104,94,84,84,89],[40,52,70,30,26]],[[111,105,105,109,121],[30,59,68,72,34]],[[159,153,153],[41,54,65]]]", help=".ndjson single line .drawing (e.g. just the strokes, no labels)") parser.add_argument( "--predict_temp_file", type=str, default="./predict_temp.tfrecord", help="path to a temporary tfrecord that will be created from the .ndjson drawing data") FLAGS, unparsed = parser.parse_known_args() tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
我已经像这样运行以上内容:
I've ran the above like so:
python classify.py --classes_file=rnn_tutorial_data/training.tfrecord.classes --model_dir=model_gpu_all/ --training_data=./rnn_tutorial_data/training.tfrecord-?????-of-????? --eval_data=./rnn_tutorial_data/eval.tfrecord-?????-of-????? --predict_for_data="[[[73,66,46,23,12,11,22,48,58,67,70,65],[11,6,2,10,23,33,48,56,54,41,22,10]],[[66,85,71],[9,3,26]],[[24,1,2,8],[6,1,10,19]],[[64,88,134,176,180,184,184,174,111,63,47],[34,29,28,35,39,58,91,94,86,71,62]],[[64,61,62],[74,83,102]],[[83,84,87],[78,102,107]],[[157,159,164],[96,108,116]],[[175,182],[91,115]],[[182,186,198,209,223,234,251,255],[51,36,29,30,38,39,20,8]],[[157,136,128,133,139],[35,47,57,35,29]],[[104,94,84,84,89],[40,52,70,30,26]],[[111,105,105,109,121],[30,59,68,72,34]],[[159,153,153],[41,54,65]]]" --predict_temp_file=./predict_temp.tfrecord --cell_type=cudnn_lstm
最后得到一个预测:
Predicted Class is: cow with a probability of 0.533384
不是一个很好的例子(教程会警告数据集的大小和准确性),但这只是一个预测,是的!在此示例中,完全执行耗时31秒.
Not a great one (tutorial warns on dataset size and accuracy), but it's a prediction, yaaay! Full execution took 31 seconds with this example.
推荐答案
python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
使用上述命令的AFAIK也可以正常工作,它将简单地读取您一次下载数据文件的文件夹中的所有文件.
AFAIK using the above command works as well, it will simply read all the files in the folder where you have download the data files one by one.
create_tfrecord_for_prediction
当然不是我自己创建的,该代码主要是从tensorflow的其他文件中提取的create_dataset.py
create_tfrecord_for_prediction
is certainly not my own creation, this code was mostly picked from another file from tensorflow guyscreate_dataset.py
下面,我粘贴了我添加的几乎所有新代码,包括对
main()
函数的修改Below I have pasted the almost all of the new code i added including the my modifications to the
main()
functiondef create_tfrecord_for_prediction(batch_size, stoke_data, tfrecord_file): def parse_line(stoke_data): """Parse provided stroke data and ink (as np array) and classname.""" inkarray = json.loads(stoke_data) stroke_lengths = [len(stroke[0]) for stroke in inkarray] total_points = sum(stroke_lengths) np_ink = np.zeros((total_points, 3), dtype=np.float32) current_t = 0 for stroke in inkarray: if len(stroke[0]) != len(stroke[1]): print("Inconsistent number of x and y coordinates.") return None for i in [0, 1]: np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i] current_t += len(stroke[0]) np_ink[current_t - 1, 2] = 1 # stroke_end # Preprocessing. # 1. Size normalization. lower = np.min(np_ink[:, 0:2], axis=0) upper = np.max(np_ink[:, 0:2], axis=0) scale = upper - lower scale[scale == 0] = 1 np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale # 2. Compute deltas. #np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2] #np_ink = np_ink[1:, :] np_ink[1:, 0:2] -= np_ink[0:-1, 0:2] np_ink = np_ink[1:, :] features = {} features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten())) features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape)) f = tf.train.Features(feature=features) ex = tf.train.Example(features=f) return ex if stoke_data is None: print("Error: Stroke data cannot be none") return example = parse_line(stoke_data) #Remove the file if it already exists if tf.gfile.Exists(tfrecord_file): tf.gfile.Remove(tfrecord_file) writer = tf.python_io.TFRecordWriter(tfrecord_file) for i in range(batch_size): writer.write(example.SerializeToString()) writer.flush() writer.close() def get_classes(): classes = [] with tf.gfile.GFile(FLAGS.classes_file, "r") as f: classes = [x.rstrip() for x in f] return classes def main(unused_args): print("%s: I Starting application" % (datetime.now())) estimator, train_spec, eval_spec = create_estimator_and_specs( run_config=tf.estimator.RunConfig( model_dir=FLAGS.model_dir, save_checkpoints_secs=300, save_summary_steps=100)) tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) if FLAGS.predict_for_data != None: print("%s: I Starting prediction" % (datetime.now())) class_names = get_classes() create_tfrecord_for_prediction(FLAGS.batch_size, FLAGS.predict_for_data, FLAGS.predict_temp_file) predict_results = estimator.predict(input_fn=get_input_fn( mode=tf.estimator.ModeKeys.PREDICT, tfrecord_pattern=FLAGS.predict_temp_file, batch_size=FLAGS.batch_size)) #predict_results = estimator.predict(input_fn=predict_input_fn) for idx, prediction in enumerate(predict_results): index = prediction["class_index"] # Get the predicted class (index) probability = prediction["probabilities"][index] class_name = class_names[index] print("%s: Predicted Class is: %s with a probability of %f" % (datetime.now(), class_name, probability)) break #We care for only the first prediction, rest are all duplicates just to meet the batch size
FLAGS.predict_for_data
这是保存笔触数据的命令行参数FLAGS.predict_temp_file
只是我用来创建临时输入数据tfrecord文件的文件名FLAGS.predict_for_data
this is the command line argument that holds the strokes dataFLAGS.predict_temp_file
is just a name of file i use to create the temporary input data tfrecord file注意1:与此同时,我还修改了
get_input_fn()
中的一些代码,您可以在此PR中找到此代码更改: https://github.com/tensorflow/models/pull/3440 (尚未合并)Note1 : Along with this I also modified some code in
get_input_fn()
you can find this code change in this PR: https://github.com/tensorflow/models/pull/3440 (has not been merged yet)注2:我还必须修改model_fn()并在注释
#Compute current predictions
Note2: I also had to modify model_fn() and add the below few lines my additions are after the comment
#Compute current predictions
# Build the model. inks, lengths, labels = _get_input_tensors(features, labels) convolved, lengths = _add_conv_layers(inks, lengths) final_state = _add_rnn_layers(convolved, lengths) logits = _add_fc_layers(final_state) # Compute current predictions. predictions = tf.argmax(logits, axis=1) if mode == tf.estimator.ModeKeys.PREDICT: preds = { "class_index": predictions, #"class_index": predictions[:, tf.newaxis], "probabilities": tf.nn.softmax(logits), "logits": logits } #preds = {"logits": logits, "predictions": predictions} return tf.estimator.EstimatorSpec(mode, predictions=preds) # Add the loss. cross_entropy = tf.reduce_mean( tf.nn.sparse_softmax_cross_entropy_with_logits( labels=labels, logits=logits)) # Add the optimizer. train_op = tf.contrib.layers.optimize_loss( loss=cross_entropy, global_step=tf.train.get_global_step(), learning_rate=params.learning_rate, optimizer="Adam", # some gradient clipping stabilizes training in the beginning. clip_gradients=params.gradient_clipping_norm, summaries=["learning_rate", "loss", "gradients", "gradient_norm"]) return tf.estimator.EstimatorSpec( mode=mode, predictions={"logits": logits, "predictions": predictions}, loss=cross_entropy, train_op=train_op, eval_metric_ops={"accuracy": tf.metrics.accuracy(labels, predictions)})
然后剩下的唯一事情就是弄清楚生成笔画数据.为此,您可以读取现有的tfrecord文件之一,然后从该读取操作中提取笔画,也可以编写一些javascript网页来生成笔画
The only thing left is then to figure out generating strokes data. For this you can take one of the existing tfrecord file read it and extract a stroke from that read operation or you could write some javascript webpage to generate strokes
这篇关于如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
- how do I know when training is complete ? (these are the last messages from the last training session:
- 我怎么知道培训何时完成? (这些是上次培训课程中的最后一条消息: