pandas - Counting occurrences of a value in a DataFrame per each unique value in another column

Question

Supposing that I have a DataFrame along the lines of:

    term      score
0   this          0
1   that          1
2   the other     3
3   something     2
4   anything      1
5   the other     2
6   that          2
7   this          0
8   something     1

How would I go about counting up the instances in the score column by unique values in the term column? Producing a result like:

    term      score 0     score 1     score 2     score 3
0   this            2           0           0           0
1   that            0           1           1           0
2   the other       0           0           1           1
3   something       0           1           1           0
4   anything        0           1           0           0

Related questions I've read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be quite what I'm looking to do. pivot_table, as mentioned at this question, seems like it could be relevant, but I'm impeded by lack of experience and the brevity of the pandas documentation. Thanks for any suggestions.

Recommended answer

Use groupby with size, reshape with unstack, and finally apply add_prefix:

df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')
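To see what each link in that chain contributes, here is a minimal sketch that rebuilds the sample data from the question and runs the three steps one at a time. The variable names (`data`, `pair_counts`, `wide`, `result`) are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# Step 1: count the rows for each (term, score) pair that actually occurs.
# The result is a Series with a two-level index.
pair_counts = data.groupby(["term", "score"]).size()

# Step 2: move the inner index level (score) into the columns.
# Pairs that never occur are filled with 0 instead of NaN.
wide = pair_counts.unstack(fill_value=0)

# Step 3: turn the column labels 0, 1, 2, 3 into "score 0", "score 1", ...
result = wide.add_prefix("score ")
print(result)
```

Passing fill_value=0 to unstack keeps the counts as integers; without it the missing pairs would be NaN and the columns would be converted to float.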

Or use crosstab:

df = pd.crosstab(df['term'],df['score']).add_prefix('score ')

Or pivot_table:

df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
        .add_prefix('score '))


print(df)
score      score 0  score 1  score 2  score 3
term                                         
anything         0        1        0        0
something        0        1        1        0
that             0        1        1        0
the other        0        0        1        1
this             2        0        0        0
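The output above is sorted alphabetically by term and keeps term as the index, whereas the layout requested in the question has term as an ordinary column, in order of first appearance. The following is a minimal sketch of one way to get that layout, assuming the sample data from the question. The names `data`, `counts`, `first_seen` and `flat` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# Count occurrences of each score per term (index comes back sorted by term).
counts = pd.crosstab(data["term"], data["score"]).add_prefix("score ")

# Restore the order in which each term first appears in the original data.
first_seen = pd.Index(data["term"].unique(), name="term")
counts = counts.reindex(first_seen)

# Make `term` a regular column and drop the leftover "score" columns label.
flat = counts.reset_index().rename_axis(None, axis=1)
print(flat)
```

If alphabetical order is acceptable, the reindex step can be skipped and only the reset_index call is needed.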
