Sequentially counting repeated entries

Problem description

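I am currently working on a project in which I have to measure someone's activity on a site over time, based on whether they edit the site. I have a data frame that looks similar to this: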

import pandas as pd

df = pd.DataFrame({"x": ["a", "b", "c", "b", "b"],
                   "y": ["red", "blue", "green", "yellow", "red"],
                   "z": [1, 2, 3, 4, 5]})

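I want to add a column to the data frame that counts the repeated values in column "x" (the number of edits so far), using the "z" column as the measure of when each event happened.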

For example, the additional column would be:

df["activity"] = pd.Series([1,1,1,2,3])

How would I best go about this in Python? I'm not sure what the best approach is.

Solution

Use groupby and cumcount:

# cumcount numbers the rows within each 'x' group starting at 0, so add 1
df['activity'] = df.groupby('x').cumcount() + 1
df

   x       y  z  activity
0  a     red  1         1
1  b    blue  2         1
2  c   green  3         1
3  b  yellow  4         2
4  b     red  5         3
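
Note that cumcount numbers rows in the order they currently appear within each group; it does not look at "z" on its own. The sample data happens to be sorted by "z" already. If the rows were not in time order, one possible approach is to sort by "z" first. The following is a minimal sketch under that assumption; the shuffled frame and the name `ranked` are made up for illustration and are not part of the original question or answer.

```python
import pandas as pd

# Same columns as in the question, but with the rows deliberately out of
# time order (this shuffled data is an assumption for illustration only).
df = pd.DataFrame({"x": ["b", "a", "b", "c", "b"],
                   "y": ["red", "red", "yellow", "green", "blue"],
                   "z": [5, 1, 4, 3, 2]})

# Sort by the time column so that row order matches event order,
# then number each row within its 'x' group (0-based, hence the + 1).
df = df.sort_values("z").reset_index(drop=True)
df["activity"] = df.groupby("x").cumcount() + 1

# Alternative that does not depend on row order: rank 'z' within each group.
# method="first" breaks ties by order of appearance; rank returns floats.
ranked = df.groupby("x")["z"].rank(method="first").astype(int)
```

Sorting changes the row order of the frame, while the rank variant leaves the rows where they are; which one fits better depends on whether the original order needs to be preserved.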
