Difference between rows in R on dataframe grouped by column


Question

I'm looking to get the difference in count between successive versions within each app_name. My dataset has the columns app_name, version_id, and count, plus the [difference] column I want to add.

Here is the dataset:

data = structure(list(app_name = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 
2L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"), version_id = c(1, 
1.1, 2.3, 2, 3.1, 3.3, 4, 1.1, 2.4), count = c(600L, 620L, 620L, 
200L, 200L, 250L, 250L, 15L, 36L)), .Names = c("app_name", "version_id", 
"count"), class = "data.frame", row.names = c(NA, -9L))

Given this data.frame, how can I get the lagged difference in count by both app_name and version_id? The diff for the initial (first) version of each app would be zero, since there is no earlier version to compare against.

Here is an example of what the final result would look like with the added 'diff' column:

structure(list(app_name = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 
2L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"), version_id = c(1, 
1.1, 2.3, 2, 3.1, 3.3, 4, 1.1, 2.4), count = c(600L, 620L, 620L, 
200L, 200L, 250L, 250L, 15L, 36L), diff = c(0, 20, 0, 0, 0, 1.25, 
0, 0, 2.4)), .Names = c("app_name", "version_id", "count", "diff"
), class = "data.frame", row.names = c(NA, -9L))


Recommended answer

Try using lag from dplyr:

library(dplyr)
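data %>%
  group_by(app_name) %>%
  # lag() shifts a column down one row within each group; defaulting to the
  # group's first value makes the first difference 0 instead of NA
  mutate(diffvers = version_id - dplyr::lag(version_id, default = version_id[1]),
         diffcount = count - dplyr::lag(count, default = count[1]))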

Source: local data frame [9 x 5]
Groups: app_name [3]

  app_name version_id count diffvers diffcount
    (fctr)      (dbl) (int)    (dbl)     (int)
1        a        1.0   600      0.0         0
2        a        1.1   620      0.1        20
3        a        2.3   620      1.2         0
4        b        2.0   200      0.0         0
5        b        3.1   200      1.1         0
6        b        3.3   250      0.2        50
7        b        4.0   250      0.7         0
8        c        1.1    15      0.0         0
9        c        2.4    36      1.3        21
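
For readers who would rather not depend on dplyr, the same per-group lagged difference can be sketched in base R with ave() and diff(). This is an alternative to the approach above, not part of the original answer. It assumes the rows are already ordered by version_id within each app_name, and the helper name lag_diff is made up for this illustration.

```r
# Rebuild the question's data so the example runs on its own.
data <- data.frame(
  app_name   = factor(c("a", "a", "a", "b", "b", "b", "b", "c", "c")),
  version_id = c(1, 1.1, 2.3, 2, 3.1, 3.3, 4, 1.1, 2.4),
  count      = c(600L, 620L, 620L, 200L, 200L, 250L, 250L, 15L, 36L)
)

# Hypothetical helper: difference from the previous element, 0 for the first.
lag_diff <- function(x) c(0, diff(x))

# ave() applies lag_diff within each app_name and puts the results back
# in the original row order.
data$diffvers  <- ave(data$version_id, data$app_name, FUN = lag_diff)
data$diffcount <- ave(data$count, data$app_name, FUN = lag_diff)

print(data)
```

If the rows might not be sorted, ordering first with data <- data[order(data$app_name, data$version_id), ] would be one way to make "previous row" mean "previous version". Both this sketch and the dplyr version compare adjacent rows, so they depend on that ordering.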
