Using data.table to aggregate
Question
After multiple suggestions from SO users, I am finally trying to convert my code over to using data.table.
library(data.table)
DT <- data.table(plate = paste0("plate",rep(1:2,each=5)),
id = rep(c("CTRL","CTRL","ID1","ID2","ID3"),2),
val = 1:10)
> DT
plate id val
1: plate1 CTRL 1
2: plate1 CTRL 2
3: plate1 ID1 3
4: plate1 ID2 4
5: plate1 ID3 5
6: plate2 CTRL 6
7: plate2 CTRL 7
8: plate2 ID1 8
9: plate2 ID2 9
10: plate2 ID3 10
What I would like to do is take the average of DT[,val] by plate when the id is "CTRL".
I would normally aggregate the data frame, then use match to map the values back to a new column, 'ctrl'.
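For comparison, the base R workflow described above might look roughly like the following. This is only a sketch: the names df and ctrl_means are illustrative and not from the original post, and the data frame simply mirrors the example DT.

```r
# Base R sketch: aggregate the CTRL rows, then map the means back with match()
# (df and ctrl_means are hypothetical names used for illustration)
df <- data.frame(plate = paste0("plate", rep(1:2, each = 5)),
                 id    = rep(c("CTRL", "CTRL", "ID1", "ID2", "ID3"), 2),
                 val   = 1:10)

# Mean of val per plate, computed from the CTRL rows only
ctrl_means <- aggregate(val ~ plate, data = df[df$id == "CTRL", ], FUN = mean)

# Look up each row's plate in the aggregated table and copy the mean across
df$ctrl <- ctrl_means$val[match(df$plate, ctrl_means$plate)]
```

Every row, not just the CTRL rows, ends up with its plate's control mean, which is the behaviour the data.table version needs to reproduce.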
Using the data.table package, I was able to get:
DT[id=="CTRL",ctrl:=mean(val),by=plate]
> DT
plate id val ctrl
1: plate1 CTRL 1 1.5
2: plate1 CTRL 2 1.5
3: plate1 ID1 3 NA
4: plate1 ID2 4 NA
5: plate1 ID3 5 NA
6: plate2 CTRL 6 6.5
7: plate2 CTRL 7 6.5
8: plate2 ID1 8 NA
9: plate2 ID2 9 NA
10: plate2 ID3 10 NA
What I need is:
DT <- data.table(plate = paste0("plate",rep(1:2,each=5)),
id = rep(c("CTRL","CTRL","ID1","ID2","ID3"),2),
val = 1:10,
ctrl = rep(c(1.5,6.5),each=5))
> DT
plate id val ctrl
1: plate1 CTRL 1 1.5
2: plate1 CTRL 2 1.5
3: plate1 ID1 3 1.5
4: plate1 ID2 4 1.5
5: plate1 ID3 5 1.5
6: plate2 CTRL 6 6.5
7: plate2 CTRL 7 6.5
8: plate2 ID1 8 6.5
9: plate2 ID2 9 6.5
10: plate2 ID3 10 6.5
Eventually I would like to use much more complicated selections of the values, but I do not know how to select specific values, run some function, then map those values back to the appropriate row using data frames.
Recommended answer
This is what you want to do:
DT[,ctrl:=mean(val[id=="CTRL"]),by=plate]
> DT
plate id val ctrl
1: plate1 CTRL 1 1.5
2: plate1 CTRL 2 1.5
3: plate1 ID1 3 1.5
4: plate1 ID2 4 1.5
5: plate1 ID3 5 1.5
6: plate2 CTRL 6 6.5
7: plate2 CTRL 7 6.5
8: plate2 ID1 8 6.5
9: plate2 ID2 9 6.5
10: plate2 ID3 10 6.5
Your original code DT[id=="CTRL",ctrl:=mean(val),by=plate] did not make an assignment for rows where id=="CTRL" was not true because, when you use the first argument of [, you are subsetting; the operations in the second argument are only done for the subsetted data.table.
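To make the difference concrete, here is a small side-by-side sketch of the two calls on the example data. The column names ctrl_i and ctrl_j and the two counters are made up for illustration; the data.table calls themselves are the ones from the question and the answer.

```r
library(data.table)

DT <- data.table(plate = paste0("plate", rep(1:2, each = 5)),
                 id    = rep(c("CTRL", "CTRL", "ID1", "ID2", "ID3"), 2),
                 val   = 1:10)

# Filter in the first argument (i): j only sees, and only assigns to, CTRL rows
DT[id == "CTRL", ctrl_i := mean(val), by = plate]

# Filter inside j: each plate group keeps all its rows, so all rows are assigned
DT[, ctrl_j := mean(val[id == "CTRL"]), by = plate]

# Count the rows each approach left as NA
n_missing_i <- sum(is.na(DT$ctrl_i))
n_missing_j <- sum(is.na(DT$ctrl_j))
```

With this data, ctrl_i is NA on the six non-CTRL rows, while ctrl_j is filled on every row.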