ValueError from tensorflow estimator RNNClassifier with gcloud ml-engine job
Problem description
I am working on the task.py file for submitting a gcloud MLEngine job. Previously, I used tensorflow.estimator.DNNClassifier successfully to submit jobs with my data (which consists solely of 8 columns of sequential numerical data for cryptocurrency prices and volume; no categorical features).
I have now switched to the tensorflow contrib estimator RNNClassifier. This is my current code for the relevant portion:
def get_feature_columns():
    return [
        tf.feature_column.numeric_column(feature, shape=(1,))
        for feature in column_names[:len(column_names)-1]
    ]

def build_estimator(config, learning_rate, num_units):
    return tf.contrib.estimator.RNNClassifier(
        sequence_feature_columns=get_feature_columns(),
        num_units=num_units,
        cell_type='lstm',
        rnn_cell_fn=None,
        optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate),
        config=config)

estimator = build_estimator(
    config=run_config,
    learning_rate=args.learning_rate,
    num_units=[32, 16])

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
However, I'm getting the following ValueError:
ValueError: All feature_columns must be of type _SequenceDenseColumn. You can wrap a sequence_categorical_column with an embedding_column or indicator_column. Given (type <class 'tensorflow.python.feature_column.feature_column_v2.NumericColumn'>): NumericColumn(key='LTCUSD_close', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)
I don't understand this, as the data is not categorical.
Recommended answer
As @Ben7 pointed out, sequence_feature_columns accepts columns like sequence_numeric_column. However, according to the documentation, RNNClassifier's sequence_feature_columns expects SparseTensors, whereas sequence_numeric_column is dense. This seems contradictory.
Here is a workaround I used to solve this issue (I took the to_sparse_tensor function from this answer):
def to_sparse_tensor(dense):
    # sequence_numeric_column default is float32
    zero = tf.constant(0.0, dtype=tf.dtypes.float32)
    where = tf.not_equal(dense, zero)
    indices = tf.where(where)
    values = tf.gather_nd(dense, indices)
    return tf.SparseTensor(indices, values, tf.shape(dense, out_type=tf.dtypes.int64))

def get_feature_columns():
    return [
        tf.feature_column.sequence_numeric_column(feature, shape=(1,), normalizer_fn=to_sparse_tensor)
        for feature in column_names[:len(column_names)-1]
    ]
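To make clearer what to_sparse_tensor computes, here is a rough plain-Python sketch of the same dense-to-sparse conversion. It is only an illustration and does not use TensorFlow; the function name to_sparse_parts and the sample batch are hypothetical and not part of the original answer. It collects the positions of non-zero entries, their values, and the dense shape, which are the three pieces a SparseTensor is built from.

```python
def to_sparse_parts(dense):
    """Return (indices, values, dense_shape) for a 2-D nested list.

    Mirrors the idea behind to_sparse_tensor: entries equal to 0.0
    are treated as absent and are not stored.
    """
    indices = []
    values = []
    for row, sequence in enumerate(dense):
        for col, value in enumerate(sequence):
            if value != 0.0:
                indices.append((row, col))
                values.append(value)
    # Dense shape is (number of rows, length of the longest row).
    n_cols = max((len(seq) for seq in dense), default=0)
    dense_shape = (len(dense), n_cols)
    return indices, values, dense_shape


# Hypothetical batch: 2 sequences of 3 time steps each.
batch = [
    [101.5, 0.0, 99.0],
    [0.0, 0.0, 98.25],
]
indices, values, dense_shape = to_sparse_parts(batch)
print(indices)
print(values)
print(dense_shape)
```

One thing worth noting about this approach: any entry that is exactly 0.0 is dropped from the sparse representation, so if zero is a legitimate value in your data (for example, a zero trading volume), it may be worth checking that this does not change how the sequence is interpreted.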