Tensorflow One Hot Encoder?
Question
Does tensorflow have something similar to scikit learn's one hot encoder for processing categorical data? Would using a placeholder of tf.string behave as categorical data?
I realize I can manually pre-process the data before sending it to tensorflow, but having it built in is very convenient.
Answer
As of TensorFlow 0.8, there is now a native one-hot op, tf.one_hot, that can convert a set of sparse labels to a dense one-hot representation. This is in addition to tf.nn.sparse_softmax_cross_entropy_with_logits, which can in some cases let you compute the cross entropy directly on the sparse labels instead of converting them to one-hot.
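For illustration, here is a minimal NumPy sketch of the dense matrix that tf.one_hot produces; the `one_hot` helper and its sample labels are made up for this example and are not TensorFlow API:

```python
import numpy as np

def one_hot(labels, depth):
    """Sketch of what tf.one_hot computes: each integer label becomes a
    row with 1.0 at that label's index and 0.0 everywhere else."""
    out = np.zeros((len(labels), depth), dtype=np.float32)
    out[np.arange(len(labels)), labels] = 1.0
    return out

print(one_hot([0, 2, 1], depth=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```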
Previous answer, in case you want to do it the old way: @Salvador's answer is correct in that there used to be no native op to do it. Instead of doing it in numpy, though, you can do it natively in tensorflow using the sparse-to-dense operators:
num_labels = 10

# label_batch is a tensor of numeric labels to process
# 0 <= label < num_labels
sparse_labels = tf.reshape(label_batch, [-1, 1])
derived_size = tf.shape(label_batch)[0]
# pair each row index with its label, giving (row, label) coordinates
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
concated = tf.concat(1, [indices, sparse_labels])
outshape = tf.pack([derived_size, num_labels])
# scatter 1.0 at each (row, label) coordinate, 0.0 elsewhere
labels = tf.sparse_to_dense(concated, outshape, 1.0, 0.0)
The output, labels, is a one-hot matrix of batch_size x num_labels.
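To see what each intermediate tensor holds, the same construction can be sketched in plain NumPy; the sample values in `label_batch` here are made up for illustration:

```python
import numpy as np

label_batch = np.array([3, 0, 7])  # hypothetical batch of integer labels
num_labels = 10
derived_size = label_batch.shape[0]

# Pair each row index with its label, mirroring the tf.concat step above
indices = np.arange(derived_size).reshape(-1, 1)
sparse_labels = label_batch.reshape(-1, 1)
concated = np.concatenate([indices, sparse_labels], axis=1)

# Scatter 1.0 into a zero matrix at those (row, label) positions,
# which is what tf.sparse_to_dense does
labels = np.zeros((derived_size, num_labels), dtype=np.float32)
labels[concated[:, 0], concated[:, 1]] = 1.0
```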
Note also that as of 2016-02-12 (which I assume will eventually be part of a 0.7 release), TensorFlow also has the tf.nn.sparse_softmax_cross_entropy_with_logits op, which in some cases can let you do training without needing to convert to a one-hot encoding.
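As a sketch of the loss that op computes, here it is in numerically stabilized NumPy; the `sparse_softmax_xent` helper is a hypothetical illustration, not the TensorFlow implementation:

```python
import numpy as np

def sparse_softmax_xent(logits, labels):
    """Sketch of tf.nn.sparse_softmax_cross_entropy_with_logits:
    softmax over the logits, then -log of the probability at each
    integer label -- no one-hot conversion needed."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]
```

With uniform logits over 4 classes, every label has probability 0.25, so the per-example loss is log(4).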
Edited to add: At the end, you may need to explicitly set the shape of labels. The shape inference doesn't recognize the size of the num_labels component. If you don't need a dynamic batch size with derived_size, this can be simplified.
Edited 2016-02-12 to change the assignment of outshape per comment below.