Pandas: Get values from column that appear more than X times

Question

I have a data frame in pandas and would like to get all the values of a certain column that appear more than X times. I know this should be easy, but somehow I am not getting anywhere with my current attempts.

Here is an example:

>>> df2 = pd.DataFrame([{"uid": 0, "mi":1}, {"uid": 0, "mi":2}, {"uid": 0, "mi":1}, {"uid": 0, "mi":1}])
>>> df2

    mi  uid
0    1   0
1    2   0
2    1   0
3    1   0

Now suppose I want to get all values from column "mi" that appear more than 2 times. The result should be

>>> <fancy query>
array([1])

I have tried a couple of things with groupby and count, but I always end up with a series holding the values and their respective counts, and I don't know how to extract from it the values whose count is more than X:

>>> df2.groupby('mi').mi.count() > 2
mi
1      True
2     False
dtype: bool

But how can I now use this to get the values of mi for which the result is True?

Any hints appreciated :)

Recommended answer

Or how about this:

Create the table:

>>> import pandas as pd
>>> df2 = pd.DataFrame([{"uid": 0, "mi":1}, {"uid": 0, "mi":2}, {"uid": 0, "mi":1}, {"uid": 0, "mi":1}])

Get the count of each occurrence:

>>> vc = df2.mi.value_counts()
>>> print(vc)
1    3
2    1

Print out those that occur more than 2 times:

>>> print(vc[vc > 2].index[0])
1
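
Note that `index[0]` returns only the first matching value, whereas the question asks for all of them. A minimal sketch of one way to return every value above the threshold is shown below. The helper name `values_appearing_more_than` is hypothetical and not part of pandas. The second half applies the same boolean indexing to the groupby counts from the question.

```python
import pandas as pd

df2 = pd.DataFrame([{"uid": 0, "mi": 1}, {"uid": 0, "mi": 2},
                    {"uid": 0, "mi": 1}, {"uid": 0, "mi": 1}])


def values_appearing_more_than(series, x):
    """Hypothetical helper: distinct values occurring more than x times."""
    vc = series.value_counts()
    # vc > x is a boolean mask; the surviving index holds the values themselves.
    return vc[vc > x].index.to_numpy()


result = values_appearing_more_than(df2["mi"], 2)
print(result)  # [1]

# The same idea applied to the groupby counts from the question:
counts = df2.groupby("mi")["mi"].count()
groupby_result = counts[counts > 2].index.to_numpy()
print(groupby_result)  # [1]
```

In both variants the counts are indexed by the column's values, so filtering the counts with a boolean mask and reading `.index` yields the values whose count exceeds X.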
