Upweight a Category
Problem description
I have built a TensorFlow model that uses a DNNClassifier to classify input into two categories.
My problem is that Outcome 1 occurs 90-95% of the time or more. As a result, TensorFlow is giving me the same probabilities for all of my predictions.
I am trying to predict the other outcome (i.e. a false positive for Outcome 2 is preferable to missing a possible occurrence of Outcome 2). I know that in machine learning in general, it would be worthwhile in this case to try to upweight Outcome 2.
However, I don't know how to do this in TensorFlow. The documentation alludes to it being possible, but I can't find any examples of what it would actually look like. Has anyone successfully done this, or does anyone know where I could find some example code or a thorough explanation (I'm using Python)?
Note: I have seen weights being manipulated directly when someone is using the more fundamental parts of TensorFlow rather than an estimator. For maintenance reasons, I need to do this using an estimator.
Recommended answer
The tf.estimator.DNNClassifier constructor has a weight_column parameter:
weight_column: A string or a _NumericColumn created by tf.feature_column.numeric_column defining feature column representing weights. It is used to down weight or boost examples during training. It will be multiplied by the loss of the example. If it is a string, it is used as a key to fetch weight tensor from the features. If it is a _NumericColumn, raw tensor is fetched by key weight_column.key, then weight_column.normalizer_fn is applied on it to get weight tensor.
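To make "multiplied by the loss of the example" concrete, here is a minimal NumPy sketch of a per-example binary log loss scaled by a weight. It only illustrates the idea and is not TensorFlow's actual implementation; the function name weighted_log_loss and the sample numbers are made up for this illustration, and how the estimator reduces the weighted losses over a batch is not shown.

```python
import numpy as np

def weighted_log_loss(labels, probs, weights):
    """Per-example binary cross-entropy, scaled by a per-example weight.

    Hypothetical helper for illustration only (not a TensorFlow API).
    """
    labels = np.asarray(labels, dtype=np.float64)
    # Clip probabilities to avoid log(0).
    probs = np.clip(np.asarray(probs, dtype=np.float64), 1e-7, 1 - 1e-7)
    weights = np.asarray(weights, dtype=np.float64)
    per_example = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return weights * per_example

labels = [0, 0, 1]
probs = [0.1, 0.1, 0.1]  # predicted probability of class 1; the model favors class 0 everywhere

unweighted = weighted_log_loss(labels, probs, [1.0, 1.0, 1.0])
upweighted = weighted_log_loss(labels, probs, [1.0, 1.0, 5.0])
print(unweighted)
print(upweighted)
```

With a weight of 5.0 on the rare example, missing it contributes five times as much to the loss as before, while the common-class examples are unchanged. That is what pushes training away from always predicting the majority class.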
So just add a new column and fill it with some weight for the rare class:
weight = tf.feature_column.numeric_column('weight')
...
tf.estimator.DNNClassifier(..., weight_column=weight)
[Update] Here's a complete working example:
# This example targets the TensorFlow 1.x API.
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('mnist', one_hot=False)
train_x, train_y = mnist.train.next_batch(1024)
test_x, test_y = mnist.test.images, mnist.test.labels

x_column = tf.feature_column.numeric_column('x', shape=[784])
weight_column = tf.feature_column.numeric_column('weight')
classifier = tf.estimator.DNNClassifier(feature_columns=[x_column],
                                        hidden_units=[100, 100],
                                        weight_column=weight_column,
                                        n_classes=10)

# Training. The weights are all 1.0 here, so every example counts equally;
# use a larger value for rare-class examples to upweight them.
train_input_fn = tf.estimator.inputs.numpy_input_fn(x={'x': train_x, 'weight': np.ones(train_x.shape[0])},
                                                    y=train_y.astype(np.int32),
                                                    num_epochs=None, shuffle=True)
classifier.train(input_fn=train_input_fn, steps=1000)

# Testing. The 'weight' feature must be supplied here too.
test_input_fn = tf.estimator.inputs.numpy_input_fn(x={'x': test_x, 'weight': np.ones(test_x.shape[0])},
                                                   y=test_y.astype(np.int32),
                                                   num_epochs=1, shuffle=False)
acc = classifier.evaluate(input_fn=test_input_fn)
print('Test Accuracy: %.3f' % acc['accuracy'])
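The example above passes np.ones(...) as the weights, which leaves every example equally weighted. Below is a small sketch of how the weight array could instead be derived from the labels so that one class is boosted. The helper name make_example_weights, the choice of class 1 as the rare class, and the weight 10.0 are all assumptions for illustration, not part of the original answer or of TensorFlow.

```python
import numpy as np

def make_example_weights(labels, rare_class, rare_weight):
    """Return one weight per example: rare_weight for the rare class, 1.0 otherwise.

    Hypothetical helper for illustration only.
    """
    labels = np.asarray(labels)
    return np.where(labels == rare_class, rare_weight, 1.0).astype(np.float32)

labels = np.array([0, 0, 0, 1, 0])
weights = make_example_weights(labels, rare_class=1, rare_weight=10.0)
print(weights)
```

An array built this way from train_y could be passed as the 'weight' entry in the training input function in place of np.ones(train_x.shape[0]). The right weight value depends on the data and on how costly a missed rare-class example is, so it is worth trying a few values.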