ValueError from tensorflow estimator RNNClassifier with gcloud ml-engine job

Question

I am working on the task.py file for submitting a gcloud ML Engine job. Previously I used tensorflow.estimator.DNNClassifier successfully to submit jobs with my data, which consists solely of 8 columns of sequential numerical data for cryptocurrency prices and volume (no categorical columns).

I have now switched to the tensorflow contrib estimator RNNClassifier. This is the relevant portion of my current code:

def get_feature_columns():
  return [
      tf.feature_column.numeric_column(feature, shape=(1,))
      for feature in column_names[:len(column_names)-1]
  ]

def build_estimator(config, learning_rate, num_units):
  return tf.contrib.estimator.RNNClassifier(
    sequence_feature_columns=get_feature_columns(),
    num_units=num_units,
    cell_type='lstm',
    rnn_cell_fn=None,
    optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate),
    config=config)

estimator = build_estimator(
    config=run_config,
    learning_rate=args.learning_rate,
    num_units=[32, 16])

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

However, I'm getting the following ValueError:

ValueError: All feature_columns must be of type _SequenceDenseColumn. You can wrap a sequence_categorical_column with an embedding_column or indicator_column. Given (type <class 'tensorflow.python.feature_column.feature_column_v2.NumericColumn'>): NumericColumn(key='LTCUSD_close', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)

I don't understand this, as the data is not categorical.

Recommended answer

As @Ben7 pointed out, sequence_feature_columns accepts columns such as sequence_numeric_column. However, according to the documentation, the sequence_feature_columns of RNNClassifier expect SparseTensors, while sequence_numeric_column produces a dense tensor. This seems contradictory.

Here is a workaround I used to solve this issue (I took the to_sparse_tensor function from this answer):

def to_sparse_tensor(dense):
    # sequence_numeric_column defaults to float32, so compare against a float32 zero
    zero = tf.constant(0.0, dtype=tf.dtypes.float32)

    # positions of the non-zero entries become the sparse indices
    where = tf.not_equal(dense, zero)
    indices = tf.where(where)
    values = tf.gather_nd(dense, indices)

    return tf.SparseTensor(indices, values, tf.shape(dense, out_type=tf.dtypes.int64))

def get_feature_columns():
  return [
      tf.feature_column.sequence_numeric_column(feature, shape=(1,), normalizer_fn=to_sparse_tensor)
      for feature in column_names[:len(column_names)-1]
  ]
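
To make it easier to see what to_sparse_tensor produces, here is a rough pure-Python sketch of the same dense-to-sparse conversion, without TensorFlow. The helper name dense_to_coo and the sample batch are hypothetical and only for illustration; the real function operates on tensors inside the graph. One thing the sketch makes visible: entries that are exactly 0.0 are dropped, so a genuine zero in the data (a zero trading volume, for instance) would not appear in the sparse output.

```python
from typing import List, Tuple


def dense_to_coo(
    dense: List[List[float]],
) -> Tuple[List[Tuple[int, int]], List[float], Tuple[int, int]]:
    """Return (indices, values, dense_shape) for the non-zero entries of a 2-D list.

    Hypothetical helper that mirrors the three pieces passed to tf.SparseTensor
    in to_sparse_tensor: positions of non-zero entries, their values, and the
    shape of the original dense input.
    """
    indices = []
    values = []
    # Walk the input in row-major order, the same order tf.where reports positions.
    for row, row_values in enumerate(dense):
        for col, value in enumerate(row_values):
            if value != 0.0:  # zeros are treated as "missing" and skipped
                indices.append((row, col))
                values.append(value)
    n_cols = len(dense[0]) if dense else 0
    return indices, values, (len(dense), n_cols)


# Made-up sample: 2 sequences of length 3, with some zero entries.
batch = [
    [101.5, 0.0, 99.25],
    [0.0, 0.0, 42.0],
]
indices, values, dense_shape = dense_to_coo(batch)
print(indices)
print(values)
print(dense_shape)
```

If zeros are meaningful in your data, you may need a different way of marking missing steps before applying this kind of conversion.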
