R: Differences by group and adding

Published: 2017/3/25 23:41:49 · Tags: r, dataframe, row


Problem description

 
I would like to know a simpler way to do this operation.

Imagine I have a data.frame like this one:
set.seed(1)
ID <- rep(1:3,each=4)
XX <- round(runif(12),3)
TT <- rep(1:4, 3)
ZZ <- ave(XX*TT,ID, FUN = cumsum)
DF <- data.frame(ID, TT, XX, ZZ)

ID  TT   XX    ZZ
1    1   0.266 0.266
1    2   0.372 1.010
1    3   0.573 2.729
1    4   0.908 6.361
2    1   0.202 0.202
2    2   0.898 1.998
2    3   0.945 4.833
2    4   0.661 7.477
3    1   0.629 0.629
3    2   0.062 0.753
3    3   0.206 1.371
3    4   0.177 2.079
I would like to get, for each column, the increments (differences between two consecutive elements) within each ID group, keeping the first value of each group as is (as if it were preceded by a zero).
ID    TT      XX    ZZ
 1    1    0.266 0.266
 1    2    0.106 0.744
 1    3    0.201 1.719
 1    4    0.335 3.632
 2    1    0.202 0.202
 2    2    0.696 1.796
 2    3    0.047 2.835
 2    4   -0.284 2.644
 3    1    0.629 0.629
 3    2   -0.567 0.124
 3    3    0.144 0.618
 3    4   -0.029 0.708
I tried:
ave(DF[3:4],DF$ID,FUN=function(x) diff(c(0,x)))
but it doesn't work; it produces the error:
 Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : 
  non-numeric argument to binary operator 
Isn't there an easy way to do it?

I've found that I can get the proper output with:
ave(DF[3:4],DF$ID,FUN=function(x) 
  sapply(x, FUN=function(y) diff(c(0,y))))
but that is quite long and complex for such a simple operation.
I've found that I can also do it with data.table, but I would prefer to do it in base R.
library(data.table)
setDT(DF)
DF[, lapply(.SD, FUN = function(x) diff(c(0, x))), keyby = ID]
I also don't know how to insert a new row (filled with zeroes) at the beginning of each group, or wherever some condition holds.
ID   XX    ZZ
1     0     0
1 0.266 0.266
1 0.372 1.010
1 0.573 2.729
1 0.908 6.361
2     0     0
2 0.202 0.202
2 0.898 1.998
2 0.945 4.833
2 0.661 7.477
3     0     0
3 0.629 0.629
3 0.062 0.753
3 0.206 1.371
3 0.177 2.079
I tried:
ave(DF[3:4],DF$ID,FUN=function(x) sapply(x, FUN=function(y) (c(0,y))))
but it gives a warning:
data length [10] is not a sub-multiple or multiple of the number of
rows [4]
I guess the general way to do it would be to work with the row indexes.

PS: I've updated the post.

Trying to make it simpler, I had removed the TT column, but I later noticed that it is important.

My solution assumes that the table is ordered by TT, but that is not always the case.
What I really want is:
XX1
XX2-XX1
XX3-XX2
XX4-XX3
where the subindexes come not from the row position in the table but from TT.
I don't know whether it is more efficient to sort the rows by TT first or to build the expression with paste().
Solution
I think you will need to use lapply() across the relevant columns, as ave() will not take a list in its first argument.  Try this:
df[-1] <- lapply(
    df[-1], 
    function(x) ave(x, df$ID, FUN = function(x) c(x[1], diff(x)))
)
which gives the updated df

   ID     XX    ZZ
1   1  0.266 0.266
2   1  0.106 0.744
3   1  0.201 1.719
4   1  0.335 3.632
5   2  0.202 0.202
6   2  0.696 1.796
7   2  0.047 2.835
8   2 -0.284 2.644
9   3  0.629 0.629
10  3 -0.567 0.124
11  3  0.144 0.618
12  3 -0.029 0.708
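
As a side note on the error in the question, here is a minimal sketch of what seems to go wrong when a data frame is passed to ave(). It assumes ave() splits the data frame into per-group data frames and hands each one to FUN; the names chunk, combined and msg are only illustrative.

```r
# One group's worth of columns, as ave() would hand it to FUN
# (assumption: each group arrives as a small data frame).
chunk <- data.frame(XX = c(0.266, 0.372), ZZ = c(0.266, 1.010))

# c() on a number and a data frame returns a plain list, not a numeric vector.
combined <- c(0, chunk)
is_list <- is.list(combined)

# diff() then attempts arithmetic on list elements and stops;
# capture the error message instead of aborting.
msg <- tryCatch(
    {
        diff(combined)
        NA_character_
    },
    error = function(e) conditionMessage(e)
)

# Applied to a single numeric column, the same expression works.
ok <- diff(c(0, chunk$XX))
```

This is why looping over the columns with lapply() (or sapply() inside FUN) helps: each call then sees an ordinary numeric vector.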

Data:
df <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L), XX = c(0.266, 0.372, 0.573, 0.908, 0.202, 0.898, 0.945, 
0.661, 0.629, 0.062, 0.206, 0.177), ZZ = c(0.266, 1.01, 2.729, 
6.361, 0.202, 1.998, 4.833, 7.477, 0.629, 0.753, 1.371, 2.079
)), .Names = c("ID", "XX", "ZZ"), class = "data.frame", row.names = c(NA, 
-12L))
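
The question also raises two points that the code above does not cover: rows that are not ordered by TT, and inserting a zero row at the start of each group. The following is a rough base R sketch of one possible approach, not a definitive method. It rebuilds the question's data with TT kept as a column and assumes TT is always positive; the names shuffled, sorted, value_cols, increments, zero_rows and padded are hypothetical.

```r
# Example data from the question, with TT kept as a column.
set.seed(1)
ID <- rep(1:3, each = 4)
XX <- round(runif(12), 3)
TT <- rep(1:4, 3)
ZZ <- ave(XX * TT, ID, FUN = cumsum)
DF <- data.frame(ID, TT, XX, ZZ)

# Shuffle the rows to mimic a table that is not ordered by TT.
shuffled <- DF[sample(nrow(DF)), ]

# 1) Sort by ID and TT first, then take the grouped increments of XX and ZZ.
sorted <- shuffled[order(shuffled$ID, shuffled$TT), ]
rownames(sorted) <- NULL
value_cols <- c("XX", "ZZ")
increments <- sorted
increments[value_cols] <- lapply(
    sorted[value_cols],
    function(x) ave(x, sorted$ID, FUN = function(v) c(v[1], diff(v)))
)

# 2) Prepend one all-zero row per ID to the sorted, untransformed table.
#    Assumption: TT > 0 in the data, so TT = 0 sorts first within each group.
zero_rows <- data.frame(ID = unique(sorted$ID), TT = 0L, XX = 0, ZZ = 0)
padded <- rbind(zero_rows, sorted)
padded <- padded[order(padded$ID, padded$TT), ]
rownames(padded) <- NULL
```

Sorting once up front keeps the increment step identical to the lapply()/ave() call shown earlier, which is probably simpler than building index expressions with paste().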


                        