Number of unique pairs within one column - pandas
Question
I am having a little problem with producing statistics for my dataframe in pandas. My dataframe looks like this (I omit the index):
id type
1 A
2 B
3 A
1 B
3 B
2 C
4 B
4 C
Importantly, each id has two type values assigned, as can be seen from the example above. I want to count the occurrences of every type combination (that is, the number of unique id values with a given type combination), so I want to get a dataframe like this:
type count
A, B 2
A, C 0
B, C 2
I tried using groupby in many ways, but in vain. I can do this kind of count using a for-loop and a number of lines of code, but I believe there has to be an elegant and proper (in Python terms) solution to this problem.
Thanks in advance for any tips.
Recommended answer
Use GroupBy + apply with value_counts:
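from itertools import combinations
import pandas as pd

def combs(types):
    # All sorted pairs of the type values belonging to one id
    return pd.Series(list(combinations(sorted(types), 2)))
# Build the pairs per id, then count how often each pair occurs
res = df.groupby('id')['type'].apply(combs).value_counts()
print(res)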
(A, B) 2
(B, C) 2
Name: type, dtype: int64
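Note that value_counts only reports pairs that actually occur, so the (A, C) row with a count of 0 from the desired output is missing above. One way to get it is sketched below. This is a variant, not the answer's exact code: it assumes every id has exactly two type values (as stated in the question), joins each pair into a string label such as "A, B", and reindexes against every possible pair. The names pairs, all_pairs, counts and result are only illustrative.

```python
from itertools import combinations

import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 2, 4, 4],
    "type": ["A", "B", "A", "B", "B", "C", "B", "C"],
})

# One label per id, e.g. "A, B" (assumes exactly two types per id).
pairs = df.groupby("id")["type"].agg(lambda s: ", ".join(sorted(s)))

# Every possible pair of types, including pairs that no id has.
all_pairs = [", ".join(p) for p in combinations(sorted(df["type"].unique()), 2)]

# Count the observed pairs and fill in 0 for the unobserved ones.
counts = pairs.value_counts().reindex(all_pairs, fill_value=0)

# Turn the counts into a two-column dataframe: type, count.
result = counts.rename_axis("type").reset_index(name="count")
print(result)
```

For the sample data this gives counts of 2, 0 and 2 for "A, B", "A, C" and "B, C". If an id could have more than two types, the combinations-based combs function from the answer would be the safer way to build the pairs.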