Group by two columns and count the occurrences of each combination in pandas

Question

I have the following data frame:

import pandas as pd
data = pd.DataFrame({'user_id' : ['a1', 'a1', 'a1', 'a2','a2','a2','a3','a3','a3'], 'product_id' : ['p1','p1','p2','p1','p1','p1','p2','p2','p3']})

product_id  user_id
    p1       a1
    p1       a1
    p2       a1
    p1       a2
    p1       a2
    p1       a2
    p2       a3
    p2       a3
    p3       a3

In the real case there might be some other columns as well, but what I need to do is group the data frame by the product_id and user_id columns, count the occurrences of each combination, and add that count as a new column in a new data frame.

The output should look like this:

user_id product_id  count
a1       p1            2
a1       p2            1
a2       p1            3
a3       p2            2
a3       p3            1

I tried the following code:

grouped=data.groupby(['user_id','product_id']).count()

But the result is:

user_id product_id
 a1       p1
          p2
 a2       p1
 a3       p2
          p3

Actually, the most important thing for me is to have a column named count that holds the number of occurrences; I need to use that column later.

Recommended answer

Maybe this is what you want?

>>> import pandas as pd
>>> data = pd.DataFrame({'user_id' : ['a1', 'a1', 'a1', 'a2','a2','a2','a3','a3','a3'], 'product_id' : ['p1','p1','p2','p1','p1','p1','p2','p2','p3']})
>>> count_series = data.groupby(['user_id', 'product_id']).size()
>>> count_series
user_id  product_id
a1       p1            2
         p2            1
a2       p1            3
a3       p2            2
         p3            1
dtype: int64
>>> new_df = count_series.to_frame(name = 'size').reset_index()
>>> new_df
  user_id product_id  size
0      a1         p1     2
1      a1         p2     1
2      a2         p1     3
3      a3         p2     2
4      a3         p3     1
>>> new_df['size']
0    2
1    1
2    3
3    2
4    1
Name: size, dtype: int64
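
Since the question asks for a column literally named count, here is a minimal variation on the session above. It is only a sketch: it assumes a pandas version in which Series.reset_index accepts a name argument, and the variable names counts and empty are illustrative, not from the original answer. It also shows why the attempt in the question printed no values: count() reports non-null counts for the columns that are not grouping keys, and in this frame both columns are keys, so nothing is left to count.

```python
import pandas as pd

data = pd.DataFrame({
    'user_id': ['a1', 'a1', 'a1', 'a2', 'a2', 'a2', 'a3', 'a3', 'a3'],
    'product_id': ['p1', 'p1', 'p2', 'p1', 'p1', 'p1', 'p2', 'p2', 'p3'],
})

# The attempt from the question: both columns are grouping keys,
# so count() has no remaining columns to report on.
empty = data.groupby(['user_id', 'product_id']).count()

# size() counts rows per group regardless of other columns;
# reset_index(name='count') flattens the index and names the value column.
counts = (
    data.groupby(['user_id', 'product_id'])
        .size()
        .reset_index(name='count')
)

print(empty.shape)
print(counts)
```

The count column can then be used like any other, for example counts['count'] or counts[counts['count'] > 1].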
