How to apply a set of functions to each group of a grouping variable in R data.frame

Views: 226 Posted: 2017/3/26 2:12:05 Tags: r dataframe rescale


Problem description


I need to reshape a data.frame in R in one step. In short, the values of the objects (x1 to x6) change row by row (from 1990 to 1995):

    > tab1[1:10, ] # raw data, see plot for tab1
       id value year
    1  x1     7 1990
    2  x1    10 1991
    3  x1    11 1992
    4  x1     7 1993
    5  x1     3 1994
    6  x1     1 1995
    7  x2     6 1990
    8  x2     7 1991
    9  x2     9 1992
    10 x2     5 1993

I can do the reshaping step by step; does anybody know how to do it in one step?

Original data: Table 1 - note that the minimum value across all time series is 0.

Step 1: Table 2 - rescale each time series so that its minimum value equals 0. Every series drops down onto the x-axis.

Step 2: Table 3 - apply the diff() function to each time series.

Step 3: Table 4 - apply the sort() function to each time series.
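The three steps can be sketched in base R for a single series. This is only an illustration using the x1 values from tab1; the variable names step1, step2 and step3 are made up here and do not appear in the question.

```r
# Values of series x1 for 1990-1995, copied from tab1
x1 <- c(7, 10, 11, 7, 3, 1)

step1 <- x1 - min(x1)  # Step 1: shift the series so its minimum is 0
step2 <- diff(step1)   # Step 2: differences between consecutive years
step3 <- sort(step2)   # Step 3: sort the differences in increasing order

print(step1)  # 6 9 10 6 2 0   (x1 rows of tab2)
print(step2)  # 3 1 -4 -4 -2   (x1 rows of tab3)
print(step3)  # -4 -4 -2 1 3   (x1 rows of tab4)
```

Since subtracting a constant does not change consecutive differences, diff(x1) gives the same result as diff(step1); Step 1 matters for the intermediate table, not for the final one.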

I hope the pictures make each step clear.

So the final table looks like this:

    > tab4[1:10, ]
       id value time
    1  x1    -4    1
    2  x1    -4    2
    3  x1    -2    3
    4  x1     1    4
    5  x1     3    5
    6  x2    -4    1
    7  x2    -3    2
    8  x2     1    3
    9  x2     1    4
    10 x2     2    5
    # Source data:
    tab1 <- data.frame(id = rep(c("x1","x2","x3","x4","x5","x6"), each = 6),
                       value = c(7,10,11,7,3,1,6,7,9,5,2,3,11,9,7,9,1,0,
                                 1,2,2,4,7,4,2,3,1,6,4,2,3,5,4,3,5,6),
                       year = rep(c(1990:1995), times = 6))

    tab2 <- data.frame(id = rep(c("x1","x2","x3","x4","x5","x6"), each = 6),
                       value = c(6,9,10,6,2,0,4,5,7,3,0,1,11,9,7,9,1,0,
                                 0,1,1,3,6,3,1,2,0,5,3,1,0,2,1,0,2,3),
                       year = rep(c(1990:1995), times = 6))

    tab3 <- data.frame(id = rep(c("x1","x2","x3","x4","x5","x6"), each = 5),
                       value = c(3,1,-4,-4,-2,1,2,-4,-3,1,-2,-2,2,-8,-1,
                                 1,0,2,3,-3,1,-2,5,-2,-2,2,-1,-1,2,1),
                       time = rep(c(1:5), times = 6))

    tab4 <- data.frame(id = rep(c("x1","x2","x3","x4","x5","x6"), each = 5),
                       value = c(-4,-4,-2,1,3,-4,-3,1,1,2,-8,-2,-2,-1,2,
                                 -3,0,1,2,3,-2,-2,-2,1,5,-1,-1,1,2,2),
                       time = rep(c(1:5), times = 6))

Solution
It sounds like you want to apply a set of functions to each group of a grouping variable. There are many ways to do this in R, from by and tapply in base R to add-on packages like plyr, data.table, and dplyr. I've been learning how to use the dplyr package, and came up with the following solution.
    library(dplyr)

    tab4 <- tab1 %>%
      group_by(id) %>%                        # group by id
      mutate(value = value - min(value),      # shift each group's minimum to 0
             value = value - lag(value)) %>%  # lag-1 difference within each group
      na.omit() %>%                           # remove NA caused by lag 1 differencing
      arrange(id, value) %>%                  # order by value within each id
      mutate(time = seq_along(value)) %>%     # time variable from 1 to 5 based on current order
      select(-year)                           # remove year column to match final OP output
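For comparison, the same result can probably be reached without add-on packages. The sketch below uses split(), lapply() and do.call(rbind, ...) from base R; the helper rescale_diff_sort and the result name tab4_base are hypothetical names introduced here, not part of the original answer, and the code assumes every id has at least two rows.

```r
# Source data, as given in the question
tab1 <- data.frame(
  id    = rep(c("x1", "x2", "x3", "x4", "x5", "x6"), each = 6),
  value = c(7, 10, 11, 7, 3, 1,  6, 7, 9, 5, 2, 3,  11, 9, 7, 9, 1, 0,
            1, 2, 2, 4, 7, 4,  2, 3, 1, 6, 4, 2,  3, 5, 4, 3, 5, 6),
  year  = rep(1990:1995, times = 6)
)

# Hypothetical helper: shift the minimum to 0, difference, then sort
rescale_diff_sort <- function(v) sort(diff(v - min(v)))

# Split the rows by id, transform each group, and build one small
# data.frame per group (values are put in year order before differencing)
pieces <- lapply(split(tab1, tab1$id), function(g) {
  v <- rescale_diff_sort(g$value[order(g$year)])
  data.frame(id = g$id[1], value = v, time = seq_along(v))
})

# Stack the per-group pieces back into a single data.frame
tab4_base <- do.call(rbind, pieces)
rownames(tab4_base) <- NULL

head(tab4_base, 10)
```

The split/lapply/rbind pattern is the manual form of what group_by() and mutate() do in the dplyr pipeline: the function is applied to each group separately and the pieces are recombined afterwards.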
