Label smoothing (soft targets) in Pandas
Question
In Pandas there is a get_dummies method that one-hot encodes a categorical variable. Now I want to do label smoothing as described in section 7.5.1 of the Deep Learning book:
Label smoothing regularizes a model based on a softmax with k output values by replacing the hard 0 and 1 classification targets with targets of eps / k and 1 - (k - 1) / k * eps, respectively.
What would be the most efficient and/or elegant way to do label smoothing in a Pandas dataframe?
Recommended answer
First, let's use a much simpler equation (ϵ denotes how much probability mass you move from the "true label" and distribute over all the remaining ones).
1 -> 1 - ϵ
0 -> ϵ / (k-1)
You can use a nice mathematical property of the above: both cases are covered by a single expression, so all you have to do is
x -> x * (1 - ϵ) + (1-x) * ϵ / (k-1)
Thus, if your dummy columns are a, b, c, d, just do
indices = ['a', 'b', 'c', 'd']
eps = 0.1
df[indices] = df[indices] * (1 - eps) + (1-df[indices]) * eps / (len(indices) - 1)
which for
>>> df
a b c d
0 1 0 0 0
1 0 1 0 0
2 0 0 0 1
3 1 0 0 0
4 0 1 0 0
5 0 0 1 0
returns
a b c d
0 0.900000 0.033333 0.033333 0.033333
1 0.033333 0.900000 0.033333 0.033333
2 0.033333 0.033333 0.033333 0.900000
3 0.900000 0.033333 0.033333 0.033333
4 0.033333 0.900000 0.033333 0.033333
5 0.033333 0.033333 0.900000 0.033333
as expected.
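For completeness, here is a minimal end-to-end sketch that starts from get_dummies and applies both smoothing schemes mentioned on this page. The helper names smooth_labels and smooth_labels_uniform and the sample labels Series are made up for illustration; they are not part of pandas. The cast to float is a precaution, since depending on the pandas version get_dummies may return boolean or integer columns. The second helper relies on the observation that the targets quoted in the question (eps / k and 1 - (k - 1) / k * eps) follow from x -> x * (1 - eps) + eps / k.

```python
import numpy as np
import pandas as pd


def smooth_labels(one_hot: pd.DataFrame, eps: float) -> pd.DataFrame:
    """Hypothetical helper: 1 -> 1 - eps, 0 -> eps / (k - 1) (the answer's scheme)."""
    k = one_hot.shape[1]          # number of classes = number of dummy columns
    x = one_hot.astype(float)     # get_dummies may yield bool or uint8 columns
    return x * (1 - eps) + (1 - x) * eps / (k - 1)


def smooth_labels_uniform(one_hot: pd.DataFrame, eps: float) -> pd.DataFrame:
    """Hypothetical helper: 1 -> 1 - (k - 1) / k * eps, 0 -> eps / k (the question's scheme)."""
    k = one_hot.shape[1]
    x = one_hot.astype(float)
    return x * (1 - eps) + eps / k


# Sample data chosen to reproduce the one-hot frame shown above.
labels = pd.Series(["a", "b", "d", "a", "b", "c"])
one_hot = pd.get_dummies(labels)

soft = smooth_labels(one_hot, eps=0.1)
soft_uniform = smooth_labels_uniform(one_hot, eps=0.1)

print(soft.round(6))
print(soft_uniform.round(6))
```

In both schemes every row still sums to 1, so the smoothed rows remain valid probability distributions. The difference is only in where the mass eps goes: to the other k - 1 classes, or spread uniformly over all k classes including the true one.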