pandas count over multiple columns

Views: 90 Published: 2020/6/17 19:04:01 Tags: python pandas graphlab

Problem description

I have a dataframe that looks like this:

Measure1 Measure2 Measure3 ...
0        1         3
1        3         2
3        0

I'd like to count the occurrences of each value across all the columns to produce:

Measure Count Percentage
0       2     0.25
1       2     0.25
2       1     0.125
3       3     0.375

Using

outcome_measure_count = cdss_data.groupby(key_columns=['Measure1'],operations={'count': agg.COUNT()}).sort('count', ascending=True)

I only get the counts for the first column. (I'm actually using the graphlab package, but I'd prefer pandas.)

Can anyone help me?

Recommended answer

You can generate the counts by flattening the df with ravel and counting the flattened values with value_counts; from this you can construct the final df:

In [230]:
import io
import pandas as pd

t="""Measure1 Measure2 Measure3
0        1         3
1        3         2
3        0        0"""

df = pd.read_csv(io.StringIO(t), sep=r'\s+')
df

Out[230]:
   Measure1  Measure2  Measure3
0         0         1         3
1         1         3         2
2         3         0         0

In [240]:
# Flatten all columns into one 1-D array, then count each distinct value
count = pd.Series(df.values.ravel()).value_counts()
pd.DataFrame({'Measure': count.index, 'Count': count.values, 'Percentage': (count / count.sum()).values})

Out[240]:
   Count  Measure  Percentage
0      3        3    0.333333
1      3        0    0.333333
2      2        1    0.222222
3      1        2    0.111111

I inserted a 0 in the last row just to make the df shape correct, but you should get the idea.
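
If the last cell really is empty, as in the question, the same flatten-and-count idea still works once the missing value is dropped. The following is a minimal sketch, not part of the original answer: it assumes the data is available as comma-separated text so that the empty cell is read as NaN, and it sorts by value so the layout matches the table the question asks for.

```python
import io
import pandas as pd

# Sample data from the question; the last cell is left empty (read as NaN).
# The CSV layout is an assumption made for this sketch.
t = """Measure1,Measure2,Measure3
0,1,3
1,3,2
3,0,
"""
df = pd.read_csv(io.StringIO(t))

# Flatten every column into one Series, drop the missing cell,
# and convert back to integers (the NaN forced a float column).
values = pd.Series(df.to_numpy().ravel()).dropna().astype(int)

# Count each distinct value and order the result by the value itself.
counts = values.value_counts().sort_index()

result = pd.DataFrame({
    "Measure": counts.index,
    "Count": counts.to_numpy(),
    "Percentage": (counts / counts.sum()).to_numpy(),
})
print(result)
```

With the eight non-missing values, this should give counts of 2, 2, 1 and 3 for the measures 0 to 3, with percentages 0.25, 0.25, 0.125 and 0.375.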
