diff operation within a group, after a dplyr::group_by()

Views: 433 | Published: 2018/5/30 14:01:14 | Tags: r, group-by, diff, dplyr

Question

Let's say I have this data.frame (with 3 variables):

ID  Period  Score
123 2013    146
123 2014    133
23  2013    150
456 2013    205
456 2014    219
456 2015    140
78  2012    192
78  2013    199
78  2014    133
78  2015    170

Using dplyr, I can group the rows by ID and keep only the IDs that appear more than once:

data <- data %>% group_by(ID) %>% filter(n() > 1)
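
For anyone who wants to reproduce this step, here is a minimal sketch that rebuilds the sample table and applies the same filter. The data.frame() call is a reconstruction from the printed table (numeric columns are an assumption), and the name `filtered` is hypothetical; neither comes from the original post.

```r
library(dplyr)

# Reconstruction of the sample table (assumed: all three columns are numeric)
data <- data.frame(
  ID     = c(123, 123, 23, 456, 456, 456, 78, 78, 78, 78),
  Period = c(2013, 2014, 2013, 2013, 2014, 2015, 2012, 2013, 2014, 2015),
  Score  = c(146, 133, 150, 205, 219, 140, 192, 199, 133, 170)
)

# Keep only IDs that occur more than once; ID 23 has a single row and is dropped
filtered <- data %>%
  group_by(ID) %>%
  filter(n() > 1) %>%
  ungroup()

nrow(filtered)        # 9
unique(filtered$ID)   # 123 456 78
```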

Now, what I would like to achieve is to add a column, Difference = Score of Period P - Score of Period P-1, to get something like this:

ID  Period  Score   Difference
123 2013    146 
123 2014    133 -13
456 2013    205 
456 2014    219 14
456 2015    140 -79
78  2012    192 
78  2013    199 7
78  2014    133 -66
78  2015    170 37

It is rather trivial to do this in a spreadsheet, but I have no idea how to achieve it in R.

Thanks for any help or guidance.

Recommended answer

Here is another solution using lag. Depending on the use case, it might be more convenient than diff because the NAs clearly show that a particular value has no predecessor, whereas a 0 in a diff-based result might mean either a) a missing predecessor or b) a subtraction between two periods with equal scores. Note that lag follows the current row order, so the rows need to be sorted by Period within each ID, as they are in the sample data.

library(dplyr)

data %>%
  group_by(ID) %>%
  filter(n() > 1) %>%                        # keep IDs that appear more than once
  mutate(Difference = Score - lag(Score))    # NA for the first row of each ID

#   ID Period Score Difference
# 1 123   2013   146         NA
# 2 123   2014   133        -13
# 3 456   2013   205         NA
# 4 456   2014   219         14
# 5 456   2015   140        -79
# 6  78   2012   192         NA
# 7  78   2013   199          7
# 8  78   2014   133        -66
# 9  78   2015   170         37
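
To make the contrast with diff concrete, the sketch below computes the column both ways on a small made-up table. It assumes the diff-based variant pads the first row of each group with 0 (base R's diff() returns one element fewer than its input, so some padding is needed to fill a column), and it sorts by Period first because both functions follow row order. The `toy` data and the column names `diff_pad0` and `diff_lag` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: ID 1 has two equal scores, so a genuine 0 difference occurs
toy <- data.frame(
  ID     = c(1, 1, 1, 2, 2),
  Period = c(2014, 2013, 2015, 2013, 2014),
  Score  = c(100, 100, 90, 50, 60)
)

result <- toy %>%
  arrange(ID, Period) %>%            # lag() and diff() follow row order, so sort first
  group_by(ID) %>%
  mutate(
    diff_pad0 = c(0, diff(Score)),   # first row padded with 0 (assumed convention)
    diff_lag  = Score - lag(Score)   # first row of each group is NA
  ) %>%
  ungroup()

result$diff_pad0   # 0 0 -10 0 10
result$diff_lag    # NA 0 -10 NA 10
```

With the 0 padding, the first two rows of ID 1 look identical even though only the second is a real zero difference; with lag() the first row is NA, so the two cases stay distinguishable.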
