Compute row percentages in pandas DataFrame?
Question
I have my data in a pandas DataFrame, and it looks like the following:
cat val1 val2 val3 val4
A 7 10 0 19
B 10 2 1 14
C 5 15 6 16
I'd like to compute the percentage of the category (cat) that each value has.
For example, for category A, val1 is 7 and the row total is 36. The resulting value would be 7/36, so val1 is 19.4% of category A.
My expected result would look like the following:
cat val1 val2 val3 val4
A .194 .278 .0 .528
B .370 .074 .037 .519
C .119 .357 .143 .381
Is there an easy way to compute this?
Recommended answer
div + sum
For a vectorised solution, divide the DataFrame along axis=0 by its row totals, that is, its sum over axis=1. You can use set_index + reset_index to keep the identifier column out of the calculation.
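# Move the identifier column into the index so only the value columns remain
df = df.set_index('cat')
# Divide every row by its own total; axis=0 aligns the totals on the row index
res = df.div(df.sum(axis=1), axis=0)
print(res.reset_index())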
cat val1 val2 val3 val4
0 A 0.194444 0.277778 0.000000 0.527778
1 B 0.370370 0.074074 0.037037 0.518519
2 C 0.119048 0.357143 0.142857 0.380952
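For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and, as a variation on the answer above, selects the value columns explicitly instead of going through set_index. The names value_cols, row_totals and pct are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({
    "cat": ["A", "B", "C"],
    "val1": [7, 10, 5],
    "val2": [10, 2, 15],
    "val3": [0, 1, 6],
    "val4": [19, 14, 16],
})

# Illustrative name: the columns whose values should be turned into row fractions
value_cols = ["val1", "val2", "val3", "val4"]

# One total per row (sum across the value columns)
row_totals = df[value_cols].sum(axis=1)

# Divide each row by its own total; axis=0 aligns the totals on the row index
fractions = df[value_cols].div(row_totals, axis=0)

# Put the identifier column back next to the computed fractions
pct = df[["cat"]].join(fractions)

print(pct.round(3))
```

Each row of the value columns in pct should sum to 1. Multiplying fractions by 100 before the join gives percentages rather than fractions.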