How to pair rows in a data frame with many columns using dplyr in R?
Problem description
I have a data frame containing multiple observations from the control and experimental cohorts, with replicates for each subject.
Here is an example of my data frame:
subject cohort replicate val1 val2
A control 1 10 0.1
A control 2 15 0.3
A experim 1 40 0.7
A experim 2 45 0.9
B control 1 5 0.3
B experim 1 30 0.0
C control 1 50 0.5
C experim 1 NA 1.0
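To make the example reproducible, the table above can be entered as a data frame. This is only a sketch: the name `df1` is borrowed from the code in the answer (the question does not name the object), and the column types (character, integer, double) are an assumption based on how the values print.

```r
# Rebuild the example table as a data frame.
# Assumption: the object is called df1, as in the answer's code.
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  cohort    = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L),
  val1      = c(10, 15, 40, 45, 5, 30, 50, NA),
  val2      = c(0.1, 0.3, 0.7, 0.9, 0.3, 0.0, 0.5, 1.0),
  stringsAsFactors = FALSE  # keep subject and cohort as character
)
```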
I'd like to pair each control observation with its corresponding experimental observation and, for each value column, calculate the experimental-to-control ratio. The desired output would look something like this:
subject replicate ratio_val1 ratio_val2
A 1 4 7
A 2 3 3
B 1 6 0
C 1 NA 2
Ideally, I'd like to see this implemented with dplyr and pipes.
Recommended answer
We can use data.table by reshaping the dataset to 'wide' format.
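library(data.table)
# Cast to wide format: one row per subject/replicate, with columns
# val1_control, val1_experim, val2_control, val2_experim
dcast(setDT(df1), subject + replicate ~ cohort, value.var = c("val1", "val2"))[,
  # Divide each 'experim' column by the matching 'control' column
  paste0("ratio_", names(df1)[4:5]) := Map(`/`,
    .SD[, grep("experim", names(.SD)), with = FALSE],
    .SD[, grep("control", names(.SD)), with = FALSE])][, (3:6) := NULL][]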
# subject replicate ratio_val1 ratio_val2
# 1: A 1 4 7
# 2: A 2 3 3
# 3: B 1 6 0
# 4: C 1 NA 2
Or, after grouping by 'subject' and 'replicate', we loop over the 'val' columns and divide the 'experim' element of each column by the corresponding 'control' element.
setDT(df1)[, lapply(.SD[, grep("val", names(.SD)), with = FALSE],
function(x) x[cohort =="experim"]/x[cohort =="control"]) ,
by = .(subject, replicate)]
Or we can use gather/spread from tidyr together with dplyr.
library(dplyr)
library(tidyr)
df1 %>%
gather(Var, Val, val1:val2) %>%
spread(cohort, Val) %>%
group_by(subject, replicate, Var) %>%
summarise(ratio = experim/control) %>% spread(Var, ratio)
# subject replicate val1 val2
# <chr> <int> <dbl> <dbl>
# 1 A 1 4 7
# 2 A 2 3 3
# 3 B 1 6 0
# 4 C 1 NA 2
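In tidyr 1.0.0 and later, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following is a sketch of the same pipeline with the newer verbs, not part of the original answer. It assumes the data frame is named `df1`, that every value column starts with "val", and that each subject/replicate pair has exactly one control row and one experimental row. The `ratio_` prefix and the name `ratios` are chosen here to match the desired output.

```r
library(dplyr)
library(tidyr)

# Example data from the question (df1 is the name used in the answer).
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  cohort    = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L),
  val1      = c(10, 15, 40, 45, 5, 30, 50, NA),
  val2      = c(0.1, 0.3, 0.7, 0.9, 0.3, 0.0, 0.5, 1.0),
  stringsAsFactors = FALSE
)

ratios <- df1 %>%
  # Stack all value columns so the same steps apply to any number of them
  pivot_longer(starts_with("val"), names_to = "var", values_to = "val") %>%
  # Put the two cohorts side by side: columns 'control' and 'experim'
  pivot_wider(names_from = cohort, values_from = val) %>%
  # Ratio of experimental to control for each subject/replicate/variable
  mutate(ratio = experim / control) %>%
  select(subject, replicate, var, ratio) %>%
  # Back to one column per variable, named ratio_val1, ratio_val2, ...
  pivot_wider(names_from = var, values_from = ratio, names_prefix = "ratio_")

print(ratios)
```

Because the value columns are selected with `starts_with("val")`, the same code should carry over to a data frame with many value columns, provided they share that prefix. The ratios are floating-point results, so compare them with `all.equal()` rather than `==`.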