Compare to next row, grouped, data.table


Problem description

I have a data frame containing the number of page views per user, per week. I want to determine, for each user, whether their views increased, decreased, or stayed the same after a certain event. My data looks like this:

Userid week xeventinweek numviews
Alice   1    2            5
Alice   2    0            3
Alice   4    1            6
Bob     2    2            3
Bob     3    0            5
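
For anyone who wants to run the example, the table above can be entered as a data.table along these lines. The object name `dat` is an assumption here (it is the name the answer's code uses), and the weeks are stored as plain numerics.

```r
library(data.table)

# Sample data copied from the table above.
# `dat` is an assumed name, chosen to match the answer's code.
dat <- data.table(
  Userid       = c("Alice", "Alice", "Alice", "Bob", "Bob"),
  week         = c(1, 2, 4, 2, 3),
  xeventinweek = c(2, 0, 1, 2, 0),
  numviews     = c(5, 3, 6, 3, 5)
)
```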

So in this case, Alice's views decreased (from 5 to 3) the week after she had 2 events in week 1; she had no events in week 2, so there is nothing to measure for that week. Bob, however, increased his views from 3 to 5 the week after he had two events.

I would like to get a table with the difference in views for every week that had at least one event. So it should look something like this:

Userid  week xeventinweek numviews numnextweek difference
Alice    1      2           5          3               -2
Alice    4      1           6          NA              NA #the row for week 2 is missing because there were no events then for Alice
Bob      2      2           3          5                2

It is not essential to have both the numnextweek and difference columns; either one is fine.

I was able to do this using data.table and a for loop, but it took so long to run that it wasn't feasible. I thought of using a rolling join, but that doesn't seem possible with grouped data (i.e. it would need to be done separately for each Userid). How can I do this using data.table's native functionality?

Recommended answer

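Use match:

# views in week + 1 for the same user; NA when that week is absent
dat[, numnextweek := numviews[match(week + 1, week)] , by=Userid]
# next week minus current week, so a drop is negative as in the question
dat[, difference := numnextweek - numviews]
# keep only the weeks with at least one event
dat[xeventinweek != 0]

#   Userid week xeventinweek numviews numnextweek difference
#1:  Alice    1            2        5           3         -2
#2:  Alice    4            1        6          NA         NA
#3:    Bob    2            2        3           5          2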
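
To see why match handles the gap in Alice's weeks, it can help to run the lookup on her group alone. This is only an illustrative sketch on plain vectors (the values are Alice's rows from the question), not part of the original answer.

```r
# Alice's rows from the question, as plain vectors
week     <- c(1, 2, 4)
numviews <- c(5, 3, 6)

# For each week, find the position of week + 1 within the same vector.
# Weeks 3 and 5 do not exist for Alice, so those lookups give NA.
idx <- match(week + 1, week)
idx            # 2 NA NA

# Indexing with NA yields NA, so missing next weeks stay missing
numviews[idx]  # 3 NA NA
```

Because the lookup is by week value rather than by row position, a missing week gives NA instead of silently comparing against the next available row.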
