如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类? [英] How to classify a QuickDraw doodle using TensorFlow's sketch RNN tutorial?

查看：146 发布时间：2020/5/4 9:16:57 python tensorflow machine-learning

本文介绍了如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

说明:

这个问题与 QuickDraw RNN绘图分类tensorflow教程有关，不是文本RNN张量流教程
为了扩展，这是

This question is regarding this QuickDraw RNN Drawing classification tensorflow tutorial, not the text RNN tensorflow tutorial
To an extend this is a duplicate of Farooq Khan's question, however I could use a few more specific details (that otherwise easily become cumbersome comments) and take the opportunity to reward Farooq for taking his time to provide further details.

我正在使用NVIDIA GeForce GT 750M 2048 MB GPU的Macbook上运行从源代码编译并具有GPU支持的tensorflow 1.6.0-rc0.

I'm running tensorflow 1.6.0-rc0 compiled from source with GPU support on a Macbook with an NVIDIA GeForce GT 750M 2048 MB GPU.

我试图像这样训练:

python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00000-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00000-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm

我正在寻找的初步澄清是:

The initial clarifications I'm looking for are:

我应该使用上面的命令，然后在完成后运行:python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00001-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00001-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm至python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00009-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00009-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm，还是应该按原样运行教程中提到的命令:python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
- 我怎么知道培训何时完成? (这些是上次培训课程中的最后一条消息:2018-04-11 01:43:27.180805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0 2018-04-11 01:43:27.180860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-04-11 01:43:27.180866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0 2018-04-11 01:43:27.180869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N 2018-04-11 01:43:27.180950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 100 MB memory) -> physical GPU (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0, compute capability: 3.0)之后没有错误或其他任何输出:很难将这些消息与之前的其他检查点区分开来)
- 如何通过自定义涂鸦进行分类?这是我的问题的核心. Farooq的回答create_tfrecord_for_prediction很棒:可以运行/测试的完整脚本太棒了
- should I use the command above, then once it completes run :python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00001-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00001-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm through to python train_model.py --model_dir=./model_gpu --training_data=./rnn_tutorial_data/training.tfrecord-00009-of-00010 --eval_data=./rnn_tutorial_data/eval.tfrecord-00009-of-00010 --classes_file=./rnn_tutorial_data/training.tfrecord.classes --cell_type=cudnn_lstm or should I run the command mentioned in the tutorial as-is: python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
  - how do I know when training is complete ? (these are the last messages from the last training session: 2018-04-11 01:43:27.180805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0 2018-04-11 01:43:27.180860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-04-11 01:43:27.180866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0 2018-04-11 01:43:27.180869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N 2018-04-11 01:43:27.180950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 100 MB memory) -> physical GPU (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0, compute capability: 3.0) No error or any other output afterwards: hard to distinguish those messages from other previous checkpoints)
  - how to pass a custom doodle for classification ? This is the core of my question. Farooq's create_tfrecord_for_prediction in his answer is great: a full script to run/test would be amazing
  Update2
  
  感谢Farooq的帮助说明，下面是经过调整的代码版本，可在控制台上显示预测:
  
  Thanks to Farooq's helpful notes, here is a tweaked version of the code that prints a prediction to console:
```
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

r"""Binary for trianing a RNN-based classifier for the Quick, Draw! data.

python train_model.py \
  --training_data train_data \
  --eval_data eval_data \
  --model_dir /tmp/quickdraw_model/ \
  --cell_type cudnn_lstm

When running on GPUs using --cell_type cudnn_lstm is much faster.

The expected performance is ~75% in 1.5M steps with the default configuration.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import ast
import functools
import sys

from datetime import datetime
import json
import numpy as np


import tensorflow as tf


def get_num_classes():
  classes = []
  with tf.gfile.GFile(FLAGS.classes_file, "r") as f:
    classes = [x for x in f]
  num_classes = len(classes)
  return num_classes


def get_input_fn(mode, tfrecord_pattern, batch_size):
  """Creates an input_fn that stores all the data in memory.

  Args:
   mode: one of tf.contrib.learn.ModeKeys.{TRAIN, INFER, EVAL}
   tfrecord_pattern: path to a TF record file created using create_dataset.py.
   batch_size: the batch size to output.

  Returns:
    A valid input_fn for the model estimator.
  """

  def _parse_tfexample_fn(example_proto, mode):
    """Parse a single record which is expected to be a tensorflow.Example."""
    feature_to_type = {
        "ink": tf.VarLenFeature(dtype=tf.float32),
        "shape": tf.FixedLenFeature([2], dtype=tf.int64)
    }
    if mode != tf.estimator.ModeKeys.PREDICT:
      # The labels won't be available at inference time, so don't add them
      # to the list of feature_columns to be read.
      feature_to_type["class_index"] = tf.FixedLenFeature([1], dtype=tf.int64)

    parsed_features = tf.parse_single_example(example_proto, feature_to_type)
    parsed_features["ink"] = tf.sparse_tensor_to_dense(parsed_features["ink"])

    if mode != tf.estimator.ModeKeys.PREDICT:
      labels = parsed_features["class_index"]
      return parsed_features, labels
    else:
      return parsed_features  # In prediction, we have no labels

  def _input_fn():
    """Estimator `input_fn`.

    Returns:
      A tuple of:
      - Dictionary of string feature name to `Tensor`.
      - `Tensor` of target labels.
    """
    dataset = tf.data.TFRecordDataset.list_files(tfrecord_pattern)
    if mode == tf.estimator.ModeKeys.TRAIN:
      dataset = dataset.shuffle(buffer_size=10)
    dataset = dataset.repeat()
    # Preprocesses 10 files concurrently and interleaves records from each file.
    dataset = dataset.interleave(
        tf.data.TFRecordDataset,
        cycle_length=10,
        block_length=1)
    dataset = dataset.map(
        functools.partial(_parse_tfexample_fn, mode=mode),
        num_parallel_calls=10)
    dataset = dataset.prefetch(10000)
    if mode == tf.estimator.ModeKeys.TRAIN:
      dataset = dataset.shuffle(buffer_size=1000000)
    # Our inputs are variable length, so pad them.
    dataset = dataset.padded_batch(
        batch_size, padded_shapes=dataset.output_shapes)

    iter = dataset.make_one_shot_iterator()
    if mode != tf.estimator.ModeKeys.PREDICT:
        features, labels = iter.get_next()
        return features, labels
    else:
        features = iter.get_next()
        return features, None  # In prediction, we have no labels

  return _input_fn


def model_fn(features, labels, mode, params):
  """Model function for RNN classifier.

  This function sets up a neural network which applies convolutional layers (as
  configured with params.num_conv and params.conv_len) to the input.
  The output of the convolutional layers is given to LSTM layers (as configured
  with params.num_layers and params.num_nodes).
  The final state of the all LSTM layers are concatenated and fed to a fully
  connected layer to obtain the final classification scores.

  Args:
    features: dictionary with keys: inks, lengths.
    labels: one hot encoded classes
    mode: one of tf.estimator.ModeKeys.{TRAIN, INFER, EVAL}
    params: a parameter dictionary with the following keys: num_layers,
      num_nodes, batch_size, num_conv, conv_len, num_classes, learning_rate.

  Returns:
    ModelFnOps for Estimator API.
  """

  def _get_input_tensors(features, labels):
    """Converts the input dict into inks, lengths, and labels tensors."""
    # features[ink] is a sparse tensor that is [8, batch_maxlen, 3]
    # inks will be a dense tensor of [8, maxlen, 3]
    # shapes is [batchsize, 2]
    shapes = features["shape"]
    # lengths will be [batch_size]
    lengths = tf.squeeze(
        tf.slice(shapes, begin=[0, 0], size=[params.batch_size, 1]))
    inks = tf.reshape(features["ink"], [params.batch_size, -1, 3])
    if labels is not None:
      labels = tf.squeeze(labels)
    return inks, lengths, labels

  def _add_conv_layers(inks, lengths):
    """Adds convolution layers."""
    convolved = inks
    for i in range(len(params.num_conv)):
      convolved_input = convolved
      if params.batch_norm:
        convolved_input = tf.layers.batch_normalization(
            convolved_input,
            training=(mode == tf.estimator.ModeKeys.TRAIN))
      # Add dropout layer if enabled and not first convolution layer.
      if i > 0 and params.dropout:
        convolved_input = tf.layers.dropout(
            convolved_input,
            rate=params.dropout,
            training=(mode == tf.estimator.ModeKeys.TRAIN))
      convolved = tf.layers.conv1d(
          convolved_input,
          filters=params.num_conv[i],
          kernel_size=params.conv_len[i],
          activation=None,
          strides=1,
          padding="same",
          name="conv1d_%d" % i)
    return convolved, lengths

  def _add_regular_rnn_layers(convolved, lengths):
    """Adds RNN layers."""
    if params.cell_type == "lstm":
      cell = tf.nn.rnn_cell.BasicLSTMCell
    elif params.cell_type == "block_lstm":
      cell = tf.contrib.rnn.LSTMBlockCell
    cells_fw = [cell(params.num_nodes) for _ in range(params.num_layers)]
    cells_bw = [cell(params.num_nodes) for _ in range(params.num_layers)]
    if params.dropout > 0.0:
      cells_fw = [tf.contrib.rnn.DropoutWrapper(cell) for cell in cells_fw]
      cells_bw = [tf.contrib.rnn.DropoutWrapper(cell) for cell in cells_bw]
    outputs, _, _ = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
        cells_fw=cells_fw,
        cells_bw=cells_bw,
        inputs=convolved,
        sequence_length=lengths,
        dtype=tf.float32,
        scope="rnn_classification")
    return outputs

  def _add_cudnn_rnn_layers(convolved):
    """Adds CUDNN LSTM layers."""
    # Convolutions output [B, L, Ch], while CudnnLSTM is time-major.
    convolved = tf.transpose(convolved, [1, 0, 2])
    lstm = tf.contrib.cudnn_rnn.CudnnLSTM(
        num_layers=params.num_layers,
        num_units=params.num_nodes,
        dropout=params.dropout if mode == tf.estimator.ModeKeys.TRAIN else 0.0,
        direction="bidirectional")
    outputs, _ = lstm(convolved)
    # Convert back from time-major outputs to batch-major outputs.
    outputs = tf.transpose(outputs, [1, 0, 2])
    return outputs

  def _add_rnn_layers(convolved, lengths):
    """Adds recurrent neural network layers depending on the cell type."""
    if params.cell_type != "cudnn_lstm":
      outputs = _add_regular_rnn_layers(convolved, lengths)
    else:
      outputs = _add_cudnn_rnn_layers(convolved)
    # outputs is [batch_size, L, N] where L is the maximal sequence length and N
    # the number of nodes in the last layer.
    mask = tf.tile(
        tf.expand_dims(tf.sequence_mask(lengths, tf.shape(outputs)[1]), 2),
        [1, 1, tf.shape(outputs)[2]])
    zero_outside = tf.where(mask, outputs, tf.zeros_like(outputs))
    outputs = tf.reduce_sum(zero_outside, axis=1)
    return outputs

  def _add_fc_layers(final_state):
    """Adds a fully connected layer."""
    return tf.layers.dense(final_state, params.num_classes)

  # Build the model.
  inks, lengths, labels = _get_input_tensors(features, labels)
  convolved, lengths = _add_conv_layers(inks, lengths)
  final_state = _add_rnn_layers(convolved, lengths)
  logits = _add_fc_layers(final_state)

  # Compute current predictions.
  predictions = tf.argmax(logits, axis=1)

  if mode == tf.estimator.ModeKeys.PREDICT:
      preds = {
          "class_index": predictions,
          #"class_index": predictions[:, tf.newaxis],
          "probabilities": tf.nn.softmax(logits),
          "logits": logits
      }
      #preds = {"logits": logits, "predictions": predictions}

      return tf.estimator.EstimatorSpec(mode, predictions=preds)
      # Add the loss.
  cross_entropy = tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(
          labels=labels, logits=logits))

  # Add the optimizer.
  train_op = tf.contrib.layers.optimize_loss(
      loss=cross_entropy,
      global_step=tf.train.get_global_step(),
      learning_rate=params.learning_rate,
      optimizer="Adam",
      # some gradient clipping stabilizes training in the beginning.
      clip_gradients=params.gradient_clipping_norm,
      summaries=["learning_rate", "loss", "gradients", "gradient_norm"])

  return tf.estimator.EstimatorSpec(
      mode=mode,
      predictions={"logits": logits, "predictions": predictions},
      loss=cross_entropy,
      train_op=train_op,
      eval_metric_ops={"accuracy": tf.metrics.accuracy(labels, predictions)})


def create_estimator_and_specs(run_config):
  """Creates an Experiment configuration based on the estimator and input fn."""
  model_params = tf.contrib.training.HParams(
      num_layers=FLAGS.num_layers,
      num_nodes=FLAGS.num_nodes,
      batch_size=FLAGS.batch_size,
      num_conv=ast.literal_eval(FLAGS.num_conv),
      conv_len=ast.literal_eval(FLAGS.conv_len),
      num_classes=get_num_classes(),
      learning_rate=FLAGS.learning_rate,
      gradient_clipping_norm=FLAGS.gradient_clipping_norm,
      cell_type=FLAGS.cell_type,
      batch_norm=FLAGS.batch_norm,
      dropout=FLAGS.dropout)

  estimator = tf.estimator.Estimator(
      model_fn=model_fn,
      config=run_config,
      params=model_params)

  train_spec = tf.estimator.TrainSpec(input_fn=get_input_fn(
      mode=tf.estimator.ModeKeys.TRAIN,
      tfrecord_pattern=FLAGS.training_data,
      batch_size=FLAGS.batch_size), max_steps=FLAGS.steps)

  eval_spec = tf.estimator.EvalSpec(input_fn=get_input_fn(
      mode=tf.estimator.ModeKeys.EVAL,
      tfrecord_pattern=FLAGS.eval_data,
      batch_size=FLAGS.batch_size))

  return estimator, train_spec, eval_spec


# def main(unused_args):
#   estimator, train_spec, eval_spec = create_estimator_and_specs(
#       run_config=tf.estimator.RunConfig(
#           model_dir=FLAGS.model_dir,
#           save_checkpoints_secs=300,
#           save_summary_steps=100))
#   tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
def create_tfrecord_for_prediction(batch_size, stoke_data, tfrecord_file):
    def parse_line(stoke_data):
        """Parse provided stroke data and ink (as np array) and classname."""
        inkarray = json.loads(stoke_data)
        stroke_lengths = [len(stroke[0]) for stroke in inkarray]
        total_points = sum(stroke_lengths)
        np_ink = np.zeros((total_points, 3), dtype=np.float32)
        current_t = 0
        for stroke in inkarray:
            if len(stroke[0]) != len(stroke[1]):
                print("Inconsistent number of x and y coordinates.")
                return None
            for i in [0, 1]:
                np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
            current_t += len(stroke[0])
            np_ink[current_t - 1, 2] = 1  # stroke_end
        # Preprocessing.
        # 1. Size normalization.
        lower = np.min(np_ink[:, 0:2], axis=0)
        upper = np.max(np_ink[:, 0:2], axis=0)
        scale = upper - lower
        scale[scale == 0] = 1
        np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
        # 2. Compute deltas.
        #np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2]
        #np_ink = np_ink[1:, :]
        np_ink[1:, 0:2] -= np_ink[0:-1, 0:2]
        np_ink = np_ink[1:, :]

        features = {}
        features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten()))
        features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape))
        f = tf.train.Features(feature=features)
        ex = tf.train.Example(features=f)
        return ex

    if stoke_data is None:
        print("Error: Stroke data cannot be none")
        return

    example = parse_line(stoke_data)

    #Remove the file if it already exists
    if tf.gfile.Exists(tfrecord_file):
        tf.gfile.Remove(tfrecord_file)

    writer = tf.python_io.TFRecordWriter(tfrecord_file)
    for i in range(batch_size):
        writer.write(example.SerializeToString())
    writer.flush()
    writer.close()
    print ('wrote',tfrecord_file)

def get_classes():
  classes = []
  with tf.gfile.GFile(FLAGS.classes_file, "r") as f:
    classes = [x.rstrip() for x in f]
  return classes

def main(unused_args):
  print("%s: I Starting application" % (datetime.now()))
  print("FLAGS",FLAGS)
  estimator, train_spec, eval_spec = create_estimator_and_specs(
      run_config=tf.estimator.RunConfig(
          model_dir=FLAGS.model_dir,
          save_checkpoints_secs=300,
          save_summary_steps=100))
  print("estimator",estimator,"train_spec",train_spec,"eval_spec",eval_spec) 
  tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

  if FLAGS.predict_for_data != None:
      print("%s: I Starting prediction" % (datetime.now()))
      class_names = get_classes()
      create_tfrecord_for_prediction(FLAGS.batch_size, FLAGS.predict_for_data, FLAGS.predict_temp_file)
      predict_results = estimator.predict(input_fn=get_input_fn(
          mode=tf.estimator.ModeKeys.PREDICT,
          tfrecord_pattern=FLAGS.predict_temp_file,
          batch_size=FLAGS.batch_size))

      #predict_results = estimator.predict(input_fn=predict_input_fn)
      for idx, prediction in enumerate(predict_results):
          index = prediction["class_index"]  # Get the predicted class (index)
          probability = prediction["probabilities"][index]
          class_name = class_names[index]
          print("%s: Predicted Class is: %s with a probability of %f" % (datetime.now(), class_name, probability))
          break #We care for only the first prediction, rest are all duplicates just to meet the batch size


if __name__ == "__main__":
  parser = argparse.ArgumentParser()
  parser.register("type", "bool", lambda v: v.lower() == "true")
  parser.add_argument(
      "--training_data",
      type=str,
      default="",
      help="Path to training data (tf.Example in TFRecord format)")
  parser.add_argument(
      "--eval_data",
      type=str,
      default="",
      help="Path to evaluation data (tf.Example in TFRecord format)")
  parser.add_argument(
      "--classes_file",
      type=str,
      default="",
      help="Path to a file with the classes - one class per line")
  parser.add_argument(
      "--num_layers",
      type=int,
      default=3,
      help="Number of recurrent neural network layers.")
  parser.add_argument(
      "--num_nodes",
      type=int,
      default=128,
      help="Number of node per recurrent network layer.")
  parser.add_argument(
      "--num_conv",
      type=str,
      default="[48, 64, 96]",
      help="Number of conv layers along with number of filters per layer.")
  parser.add_argument(
      "--conv_len",
      type=str,
      default="[5, 5, 3]",
      help="Length of the convolution filters.")
  parser.add_argument(
      "--cell_type",
      type=str,
      default="lstm",
      help="Cell type used for rnn layers: cudnn_lstm, lstm or block_lstm.")
  parser.add_argument(
      "--batch_norm",
      type="bool",
      default="False",
      help="Whether to enable batch normalization or not.")
  parser.add_argument(
      "--learning_rate",
      type=float,
      default=0.0001,
      help="Learning rate used for training.")
  parser.add_argument(
      "--gradient_clipping_norm",
      type=float,
      default=9.0,
      help="Gradient clipping norm used during training.")
  parser.add_argument(
      "--dropout",
      type=float,
      default=0.3,
      help="Dropout used for convolutions and bidi lstm layers.")
  parser.add_argument(
      "--steps",
      type=int,
      default=100000,
      help="Number of training steps.")
  parser.add_argument(
      "--batch_size",
      type=int,
      default=8,
      help="Batch size to use for training/evaluation.")
  parser.add_argument(
      "--model_dir",
      type=str,
      default="",
      help="Path for storing the model checkpoints.")
  parser.add_argument(
      "--self_test",
      type=bool,
      default="False",
      help="Whether to enable batch normalization or not.")
  parser.add_argument(
      "--predict_for_data",
      type=str,
      default="[[[73,66,46,23,12,11,22,48,58,67,70,65],[11,6,2,10,23,33,48,56,54,41,22,10]],[[66,85,71],[9,3,26]],[[24,1,2,8],[6,1,10,19]],[[64,88,134,176,180,184,184,174,111,63,47],[34,29,28,35,39,58,91,94,86,71,62]],[[64,61,62],[74,83,102]],[[83,84,87],[78,102,107]],[[157,159,164],[96,108,116]],[[175,182],[91,115]],[[182,186,198,209,223,234,251,255],[51,36,29,30,38,39,20,8]],[[157,136,128,133,139],[35,47,57,35,29]],[[104,94,84,84,89],[40,52,70,30,26]],[[111,105,105,109,121],[30,59,68,72,34]],[[159,153,153],[41,54,65]]]",
      help=".ndjson single line .drawing (e.g. just the strokes, no labels)")
  parser.add_argument(
      "--predict_temp_file",
      type=str,
      default="./predict_temp.tfrecord",
      help="path to a temporary tfrecord that will be created from the .ndjson drawing data")

  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
```
  我已经像这样运行以上内容:
  
  I've ran the above like so:
```
python classify.py --classes_file=rnn_tutorial_data/training.tfrecord.classes --model_dir=model_gpu_all/ --training_data=./rnn_tutorial_data/training.tfrecord-?????-of-????? --eval_data=./rnn_tutorial_data/eval.tfrecord-?????-of-????? --predict_for_data="[[[73,66,46,23,12,11,22,48,58,67,70,65],[11,6,2,10,23,33,48,56,54,41,22,10]],[[66,85,71],[9,3,26]],[[24,1,2,8],[6,1,10,19]],[[64,88,134,176,180,184,184,174,111,63,47],[34,29,28,35,39,58,91,94,86,71,62]],[[64,61,62],[74,83,102]],[[83,84,87],[78,102,107]],[[157,159,164],[96,108,116]],[[175,182],[91,115]],[[182,186,198,209,223,234,251,255],[51,36,29,30,38,39,20,8]],[[157,136,128,133,139],[35,47,57,35,29]],[[104,94,84,84,89],[40,52,70,30,26]],[[111,105,105,109,121],[30,59,68,72,34]],[[159,153,153],[41,54,65]]]" --predict_temp_file=./predict_temp.tfrecord --cell_type=cudnn_lstm
```
  最后得到一个预测:
```
Predicted Class is: cow with a probability of 0.533384
```
  不是一个很好的例子(教程会警告数据集的大小和准确性)，但这只是一个预测，是的！在此示例中，完全执行耗时31秒.
  
  Not a great one (tutorial warns on dataset size and accuracy), but it's a prediction, yaaay! Full execution took 31 seconds with this example.
  
  推荐答案
  
  python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
  
  python train_model.py \ --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \ --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \ --classes_file=rnn_tutorial_data/training.tfrecord.classes
  
  使用上述命令的AFAIK也可以正常工作，它将简单地读取您一次下载数据文件的文件夹中的所有文件.
  
  AFAIK using the above command works as well, it will simply read all the files in the folder where you have download the data files one by one.
  
  create_tfrecord_for_prediction当然不是我自己创建的，该代码主要是从tensorflow的其他文件中提取的create_dataset.py
  
  create_tfrecord_for_prediction is certainly not my own creation, this code was mostly picked from another file from tensorflow guys create_dataset.py
  
  下面，我粘贴了我添加的几乎所有新代码，包括对main()函数的修改
  
  Below I have pasted the almost all of the new code i added including the my modifications to the main() function
```
def create_tfrecord_for_prediction(batch_size, stoke_data, tfrecord_file):
    def parse_line(stoke_data):
        """Parse provided stroke data and ink (as np array) and classname."""
        inkarray = json.loads(stoke_data)
        stroke_lengths = [len(stroke[0]) for stroke in inkarray]
        total_points = sum(stroke_lengths)
        np_ink = np.zeros((total_points, 3), dtype=np.float32)
        current_t = 0
        for stroke in inkarray:
            if len(stroke[0]) != len(stroke[1]):
                print("Inconsistent number of x and y coordinates.")
                return None
            for i in [0, 1]:
                np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
            current_t += len(stroke[0])
            np_ink[current_t - 1, 2] = 1  # stroke_end
        # Preprocessing.
        # 1. Size normalization.
        lower = np.min(np_ink[:, 0:2], axis=0)
        upper = np.max(np_ink[:, 0:2], axis=0)
        scale = upper - lower
        scale[scale == 0] = 1
        np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
        # 2. Compute deltas.
        #np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2]
        #np_ink = np_ink[1:, :]
        np_ink[1:, 0:2] -= np_ink[0:-1, 0:2]
        np_ink = np_ink[1:, :]

        features = {}
        features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten()))
        features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape))
        f = tf.train.Features(feature=features)
        ex = tf.train.Example(features=f)
        return ex

    if stoke_data is None:
        print("Error: Stroke data cannot be none")
        return

    example = parse_line(stoke_data)

    #Remove the file if it already exists
    if tf.gfile.Exists(tfrecord_file):
        tf.gfile.Remove(tfrecord_file)

    writer = tf.python_io.TFRecordWriter(tfrecord_file)
    for i in range(batch_size):
        writer.write(example.SerializeToString())
    writer.flush()
    writer.close()

def get_classes():
  classes = []
  with tf.gfile.GFile(FLAGS.classes_file, "r") as f:
    classes = [x.rstrip() for x in f]
  return classes

def main(unused_args):
  print("%s: I Starting application" % (datetime.now()))

  estimator, train_spec, eval_spec = create_estimator_and_specs(
      run_config=tf.estimator.RunConfig(
          model_dir=FLAGS.model_dir,
          save_checkpoints_secs=300,
          save_summary_steps=100))
  tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

  if FLAGS.predict_for_data != None:
      print("%s: I Starting prediction" % (datetime.now()))
      class_names = get_classes()
      create_tfrecord_for_prediction(FLAGS.batch_size, FLAGS.predict_for_data, FLAGS.predict_temp_file)
      predict_results = estimator.predict(input_fn=get_input_fn(
          mode=tf.estimator.ModeKeys.PREDICT,
          tfrecord_pattern=FLAGS.predict_temp_file,
          batch_size=FLAGS.batch_size))

      #predict_results = estimator.predict(input_fn=predict_input_fn)
      for idx, prediction in enumerate(predict_results):
          index = prediction["class_index"]  # Get the predicted class (index)
          probability = prediction["probabilities"][index]
          class_name = class_names[index]
          print("%s: Predicted Class is: %s with a probability of %f" % (datetime.now(), class_name, probability))
          break #We care for only the first prediction, rest are all duplicates just to meet the batch size
```
  FLAGS.predict_for_data这是保存笔触数据的命令行参数 FLAGS.predict_temp_file只是我用来创建临时输入数据tfrecord文件的文件名
  
  FLAGS.predict_for_data this is the command line argument that holds the strokes data FLAGS.predict_temp_file is just a name of file i use to create the temporary input data tfrecord file
  注意1:与此同时，我还修改了get_input_fn()中的一些代码，您可以在此PR中找到此代码更改: https://github.com/tensorflow/models/pull/3440 (尚未合并)
  
  Note1 : Along with this I also modified some code in get_input_fn() you can find this code change in this PR: https://github.com/tensorflow/models/pull/3440 (has not been merged yet)
  
  注2:我还必须修改model_fn()并在注释#Compute current predictions
  
  Note2: I also had to modify model_fn() and add the below few lines my additions are after the comment #Compute current predictions
```
  # Build the model.
  inks, lengths, labels = _get_input_tensors(features, labels)
  convolved, lengths = _add_conv_layers(inks, lengths)
  final_state = _add_rnn_layers(convolved, lengths)
  logits = _add_fc_layers(final_state)

  # Compute current predictions.
  predictions = tf.argmax(logits, axis=1)

  if mode == tf.estimator.ModeKeys.PREDICT:
      preds = {
          "class_index": predictions,
          #"class_index": predictions[:, tf.newaxis],
          "probabilities": tf.nn.softmax(logits),
          "logits": logits
      }
      #preds = {"logits": logits, "predictions": predictions}

      return tf.estimator.EstimatorSpec(mode, predictions=preds)
      # Add the loss.
  cross_entropy = tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(
          labels=labels, logits=logits))

  # Add the optimizer.
  train_op = tf.contrib.layers.optimize_loss(
      loss=cross_entropy,
      global_step=tf.train.get_global_step(),
      learning_rate=params.learning_rate,
      optimizer="Adam",
      # some gradient clipping stabilizes training in the beginning.
      clip_gradients=params.gradient_clipping_norm,
      summaries=["learning_rate", "loss", "gradients", "gradient_norm"])

  return tf.estimator.EstimatorSpec(
      mode=mode,
      predictions={"logits": logits, "predictions": predictions},
      loss=cross_entropy,
      train_op=train_op,
      eval_metric_ops={"accuracy": tf.metrics.accuracy(labels, predictions)})
```
  然后剩下的唯一事情就是弄清楚生成笔画数据.为此，您可以读取现有的tfrecord文件之一，然后从该读取操作中提取笔画，也可以编写一些javascript网页来生成笔画
  
  The only thing left is then to figure out generating strokes data. For this you can take one of the existing tfrecord file read it and extract a stroke from that read operation or you could write some javascript webpage to generate strokes
  
  这篇关于如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

推荐答案

如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类? [英] How to classify a QuickDraw doodle using TensorFlow's sketch RNN tutorial?

问题描述

相关文章

AI人工智能最新文章

热门教程

热门工具

登录关闭

如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类? [英] How to classify a QuickDraw doodle using TensorFlow&#39;s sketch RNN tutorial?

问题描述

推荐答案

相关文章

AI人工智能最新文章

热门教程

热门工具

登录 关闭

如何使用TensorFlow的素描RNN教程对QuickDraw涂鸦进行分类? [英] How to classify a QuickDraw doodle using TensorFlow's sketch RNN tutorial?

登录关闭