Count Re-occurrence of a Value in Python

Problem description

I have a data set which contains something like this:

SNo  Cookie
1       A
2       A
3       A
4       B
5       C
6       D
7       A
8       B
9       D
10      E
11      D
12      A

Say we have five cookies: A, B, C, D and E. I want to count how many times each cookie reoccurs after a different cookie has been seen in between. In the example above, cookie A is encountered again at position 7 and then again at position 12. Note that A at position 2 does not count, because it directly follows another A; at positions 7 and 12, however, other cookies were seen before A came back, so those instances count. So essentially I want something like this:

Sno Cookie  Count
 1     A     2
 2     B     1
 3     C     0
 4     D     2
 5     E     0

Can anyone give me the logic or Python code for this?

Recommended answer

One way to do this is to first drop consecutive repeats of the same Cookie, then use duplicated to flag the rows where a Cookie has already been seen, and finally groupby Cookie and sum the flags:

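# df is assumed to hold the data above, with a 'Cookie' column.
# Keep only the first row of each run of consecutive identical cookies;
# .copy() avoids a SettingWithCopyWarning when adding a column below.
no_doubles = df[df.Cookie != df.Cookie.shift()].copy()

# True where the cookie has already appeared in an earlier run
no_doubles['dups'] = no_doubles.Cookie.duplicated()

# Sum the flags per cookie to get the number of re-occurrences
no_doubles.groupby('Cookie').dups.sum()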

This gives you (the exact dtype of the sums may vary with the pandas version):

Cookie
A    2.0
B    1.0
C    0.0
D    2.0
E    0.0
Name: dups, dtype: float64
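
The same logic can also be sketched without pandas, which may help to see what the three steps are doing. This is only an illustration under assumptions: the function name count_reoccurrences is hypothetical, and the input is assumed to be a plain list of cookie labels in the order they appear. Consecutive repeats are collapsed into runs, and each cookie's count is its number of runs minus one.

```python
from collections import Counter
from itertools import groupby


def count_reoccurrences(cookies):
    """Count how often each cookie comes back after a different cookie.

    Hypothetical helper, not part of the original answer.
    """
    # groupby collapses each run of consecutive equal values into one key,
    # mirroring the `df.Cookie != df.Cookie.shift()` filter.
    runs = [cookie for cookie, _ in groupby(cookies)]

    # Every run after the first one of a cookie is a re-occurrence,
    # mirroring `duplicated()` followed by the grouped sum.
    run_counts = Counter(runs)
    return {cookie: n - 1 for cookie, n in run_counts.items()}


# Sample data from the question (positions 1 to 12)
cookies = list("AAABCDABDEDA")
print(count_reoccurrences(cookies))
# {'A': 2, 'B': 1, 'C': 0, 'D': 2, 'E': 0}
```

Because itertools.groupby only groups adjacent equal items, no sorting is needed here; sorting first would destroy the run structure the count depends on.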
