Compute percentage for each row in pandas dataframe
Problem description
country_name country_code val_code \
United States of America 231 1
United States of America 231 2
United States of America 231 3
United States of America 231 4
United States of America 231 5
y191 y192 y193 y194 y195 \
47052179 43361966 42736682 43196916 41751928
1187385 1201557 1172941 1176366 1192173
28211467 27668273 29742374 27543836 28104317
179000 193000 233338 276639 249688
12613922 12864425 13240395 14106139 15642337
In the data frame above, I would like to compute, for each row, the percentage of the overall total accounted for by that val_code, resulting in the following data frame.
That is, sum up each row and divide by the total of all rows:
country_name country_code val_code \
United States of America 231 1
United States of America 231 2
United States of America 231 3
United States of America 231 4
United States of America 231 5
perc
50.14947129
1.363631254
32.48344744
0.260213146
15.74323688
Right now I am doing this, but it is not working:
grp_df = df.groupby(['country_name', 'val_code']).agg()
pct_df = grp_df.groupby(level=0).apply(lambda x: 100*x/float(x.sum()))
Recommended answer
Get the total for all the columns of interest and then add the percentage column:
In [35]:
# .ix was removed in pandas 1.0; .loc performs the same label-based slice
total = np.sum(df.loc[:, 'y191':].values)
df['percent'] = df.loc[:, 'y191':].sum(axis=1) / total * 100
df
Out[35]:
country_name country_code val_code y191 y192 \
0 United States of America 231 1 47052179 43361966
1 United States of America 231 1 1187385 1201557
2 United States of America 231 1 28211467 27668273
3 United States of America 231 1 179000 193000
4 United States of America 231 1 12613922 12864425
y193 y194 y195 percent
0 42736682 43196916 41751928 50.149471
1 1172941 1176366 1192173 1.363631
2 29742374 27543836 28104317 32.483447
3 233338 276639 249688 0.260213
4 13240395 14106139 15642337 15.743237
So np.sum adds up all the values in the selected columns:
In [32]:
total = np.sum(df.loc[:, 'y191':].values)
total
Out[32]:
434899243
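As a quick sanity check, the grand total should equal the sum of the per-row sums. The sketch below rebuilds only the year columns from the question (the variable names `years`, `total` and `row_sums` are illustrative, not from the original post) and compares the two:

```python
import numpy as np
import pandas as pd

# Year columns copied from the question's data frame
years = pd.DataFrame(
    {
        "y191": [47052179, 1187385, 28211467, 179000, 12613922],
        "y192": [43361966, 1201557, 27668273, 193000, 12864425],
        "y193": [42736682, 1172941, 29742374, 233338, 13240395],
        "y194": [43196916, 1176366, 27543836, 276639, 14106139],
        "y195": [41751928, 1192173, 28104317, 249688, 15642337],
    }
)

# Grand total over every cell of the underlying NumPy array
total = np.sum(years.to_numpy())

# One sum per row; adding these up should give the same grand total
row_sums = years.sum(axis=1)

print(total)
print(row_sums.sum())
```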
We then call .sum(axis=1)/total * 100 on the columns of interest to sum row-wise, divide by the total and multiply by 100 to get a percentage.
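Putting the steps together, here is a minimal self-contained sketch that rebuilds the question's data and computes the percentage column. The column list `year_cols` and the `percent_in_country` column are assumptions added for illustration; the latter shows one possible way to normalise within each country if the frame held several countries, which the original answer does not cover.

```python
import pandas as pd

year_cols = ["y191", "y192", "y193", "y194", "y195"]

# Data copied from the question
df = pd.DataFrame(
    {
        "country_name": ["United States of America"] * 5,
        "country_code": [231] * 5,
        "val_code": [1, 2, 3, 4, 5],
        "y191": [47052179, 1187385, 28211467, 179000, 12613922],
        "y192": [43361966, 1201557, 27668273, 193000, 12864425],
        "y193": [42736682, 1172941, 29742374, 233338, 13240395],
        "y194": [43196916, 1176366, 27543836, 276639, 14106139],
        "y195": [41751928, 1192173, 28104317, 249688, 15642337],
    }
)

# Sum each row over the year columns, then divide by the grand total
row_sums = df[year_cols].sum(axis=1)
df["percent"] = row_sums / row_sums.sum() * 100

# Illustrative extension (not in the original answer): divide by the
# total of the row's own country instead of the total of the whole frame
country_totals = row_sums.groupby(df["country_name"]).transform("sum")
df["percent_in_country"] = row_sums / country_totals * 100

print(df[["val_code", "percent", "percent_in_country"]])
```

With a single country in the data, both columns hold the same values; they would differ only once rows for other countries are added.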