Build difference between groups with dplyr in R

Problem description

I am using dplyr and I am wondering whether it is possible to compute differences between groups in one line. In the small example below, the task is to compute the difference between the standardized "cent" variables of groups A and B.

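library(dplyr)

# Create a small data.frame: two groups of 10 observations each
GROUP <- rep(c("A", "B"), each = 10)
NUMBE <- rnorm(20, 50, 10)
datf <- data.frame(GROUP, NUMBE)

# Standardize NUMBE within each group
# (%>% replaces the %.% chaining operator of early dplyr releases)
datf2 <- datf %>% group_by(GROUP) %>% mutate(cent = (NUMBE - mean(NUMBE)) / sd(NUMBE))

# Extract each group's standardized values, then subtract row by row
gA <- datf2 %>% ungroup() %>% filter(GROUP == "A") %>% select(cent)
gB <- datf2 %>% ungroup() %>% filter(GROUP == "B") %>% select(cent)

gA - gB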

Creating separate objects works, of course, but is there a more "built-in" way of performing this task? Something like the following fantasy code, which does not work?

datf2 %>% summarize(filter(GROUP == "A", select(cent)) - filter(GROUP == "B", select(cent)))

Thank you!

Solution

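Given that each group has 10 rows, add an index that runs 1:10 within each group, then group by that index and take the difference between the two values in each pair (the group A row comes first, so cent[1] is A and cent[2] is B):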

> datf2$entry <- c(1:10, 1:10)
> datf2 %>% ungroup() %>% group_by(entry) %>% summarize(d = cent[1] - cent[2])
Source: local data frame [10 x 2]

   entry          d
1      1 -0.8272879
2      2 -0.9159827
3      3 -0.5064762
4      4  0.4211639
5      5  1.3681720
6      6  3.3430289
7      7  1.0086822
8      8 -0.6163907
9      9 -0.7325220
10    10 -2.5423875

Compare with the result from the separate objects:

> gA - gB
         cent
1  -0.8272879
2  -0.9159827
3  -0.5064762
4   0.4211639
5   1.3681720
6   3.3430289
7   1.0086822
8  -0.6163907
9  -0.7325220
10 -2.5423875
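
The agreement between the two results can also be checked in code rather than by eye. The following is a minimal sketch, not part of the original answer: the seed and the names `direct` and `indexed` are assumptions introduced here for illustration, and it uses `%>%`, which replaced `%.%` in later dplyr releases.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
datf <- data.frame(
  GROUP = rep(c("A", "B"), each = 10),
  NUMBE = rnorm(20, 50, 10)
)

# Standardize NUMBE within each group, then drop the grouping
datf2 <- datf %>%
  group_by(GROUP) %>%
  mutate(cent = (NUMBE - mean(NUMBE)) / sd(NUMBE)) %>%
  ungroup()

# Direct difference: subset the column by group label and subtract
direct <- datf2$cent[datf2$GROUP == "A"] - datf2$cent[datf2$GROUP == "B"]

# Index-based difference, as in the answer above
datf2$entry <- c(1:10, 1:10)
indexed <- datf2 %>%
  group_by(entry) %>%
  summarize(d = cent[1] - cent[2])

# Check that both routes give the same numbers
all.equal(indexed$d, direct)
```

Both routes depend on the rows of A and B being in matching order, since the pairing is purely positional.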

Is there a way to inject the entry field into the data or into the dplyr call itself? I'm not sure; the approach seems to rely on the functions knowing too much about the data (here, that each group has exactly 10 rows in matching order).
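
One possible way to build the index inside the pipeline is dplyr's `row_number()`, which numbers rows within each group. The sketch below is an assumption-laden illustration rather than the answerer's method: it assumes a dplyr version that provides `%>%`, the seed and the name `diffs` are hypothetical, and it selects values by group label instead of by position.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
datf <- data.frame(
  GROUP = rep(c("A", "B"), each = 10),
  NUMBE = rnorm(20, 50, 10)
)

diffs <- datf %>%
  group_by(GROUP) %>%
  mutate(
    cent  = (NUMBE - mean(NUMBE)) / sd(NUMBE),  # standardize within group
    entry = row_number()                        # 1..n within each group
  ) %>%
  group_by(entry) %>%                           # regroup by pair index
  summarize(d = cent[GROUP == "A"] - cent[GROUP == "B"])

diffs
```

This still assumes that both groups have the same number of rows and that the i-th row of A is meant to be paired with the i-th row of B; it only removes the hard-coded `c(1:10, 1:10)`.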
