How to pair rows in a data frame in R with dplyr?


Question

I have a data frame containing observations from the control and the experimental groups, with replicates for each subject. Here is an example of my data frame:

subject  group    replicate value
  A     control      1       10
  A     control      2       15
  A     experim      1       40
  A     experim      2       45
  B     control      1       5
  B     experim      1       30
  C     control      1       50
  C     experim      1       NA
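
To make the example reproducible, the table above can be built as a data frame along these lines. This is a minimal sketch: the name df1 is taken from the answer further down, and the column types (character for subject and group, numeric for replicate and value) are an assumption, since the question does not state them.

```r
# Example data from the question; column types are assumed.
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  group     = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  value     = c(10, 15, 40, 45, 5, 30, 50, NA),
  stringsAsFactors = FALSE
)

df1
```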

I'd like to pair each control observation with its corresponding experimental one in order to calculate the ratio between the paired values. The desired output:

subject  replicate  control   experim  ratio
  A         1         10        40       4
  A         2         15        45       3
  B         1          5        30       6
  C         1         50        NA       NA



Please note that the number of replicates per subject can vary (A has two replicates, B has only one, and C has one with a missing value). Ideally, I'd like to see this implemented with dplyr and pipes.

Recommended answer

We can use dcast from data.table to convert to 'wide' format, then create the 'ratio' column by dividing 'experim' by 'control':

library(data.table)
dcast(setDT(df1), subject+replicate~group, value.var="value")[,
            ratio:= experim/control][]
#     subject replicate control experim ratio
#1:       A         1      10      40     4
#2:       A         2      15      45     3
#3:       B         1       5      30     6
#4:       C         1      50      NA    NA






Or using spread from tidyr to convert to 'wide' format and then create the 'ratio' column with mutate:

library(dplyr)
library(tidyr)
spread(df1, group, value) %>% 
        mutate(ratio = experim/control)
#    subject replicate control experim ratio
#1       A         1      10      40     4
#2       A         2      15      45     3
#3       B         1       5      30     6
#4       C         1      50      NA    NA
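
In tidyr 1.0.0 and later, spread is superseded by pivot_wider. The same reshaping could be sketched with pivot_wider as follows, assuming a recent tidyr is installed. The data frame is rebuilt here so the block runs on its own; its column types and the result name paired are assumptions, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Example data from the question (column types assumed).
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  group     = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  value     = c(10, 15, 40, 45, 5, 30, 50, NA),
  stringsAsFactors = FALSE
)

# One row per subject/replicate, one column per group, then the ratio.
paired <- df1 %>%
  pivot_wider(names_from = group, values_from = value) %>%
  mutate(ratio = experim / control)

paired
```

Columns not named in names_from or values_from (here subject and replicate) identify the rows, so each control value lands next to the experimental value with the same subject and replicate.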






Or using reshape from base R (the reshaped columns are named value.control and value.experim):

transform(reshape(df1, idvar = c("subject", "replicate"), 
   timevar="group", direction="wide"), ratio = value.experim/value.control)
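
Since the question asks for dplyr and pipes, another option is to pair the rows with a join instead of reshaping. The following is only a sketch of that idea and is not part of the original answer: it splits the data by group and joins the two halves on subject and replicate. The names control_rows, experim_rows and paired are illustrative, and the example data is rebuilt with assumed column types so the block is self-contained.

```r
library(dplyr)

# Example data from the question (column types assumed).
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  group     = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  value     = c(10, 15, 40, 45, 5, 30, 50, NA),
  stringsAsFactors = FALSE
)

# Control observations, with the value column renamed to 'control'.
control_rows <- df1 %>%
  filter(group == "control") %>%
  select(subject, replicate, control = value)

# Experimental observations, with the value column renamed to 'experim'.
experim_rows <- df1 %>%
  filter(group == "experim") %>%
  select(subject, replicate, experim = value)

# Pair each control row with the matching experimental row, then divide.
paired <- control_rows %>%
  left_join(experim_rows, by = c("subject", "replicate")) %>%
  mutate(ratio = experim / control)

paired
```

A left join keeps every control row even when no experimental row matches, in which case experim and ratio are NA for that row.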
