How to call a function that returns multiple rows and columns in a data.table?
Question
I want to call a function inside a data.table that calculates a set of summary statistics, like the following:
summ.stats <- function(vec) {
  list(
    Min = min(vec),
    Mean = mean(vec),
    S.D. = sd(vec),
    Median = median(vec),
    Max = max(vec))
}
I want to call it in the j expression of a data.table:
library(data.table)
DT <- data.table(a=c(1,2,3,1,2,3),b=c(1,4,3,2,1,4),c=c(2,3,4,5,2,1))
DT[, summ.stats(b), by=a]
This works fine, and I get:
a Min Mean S.D. Median Max
1: 1 1 1.5 0.7071068 1.5 2
2: 2 1 2.5 2.1213203 2.5 4
3: 3 3 3.5 0.7071068 3.5 4
But I am interested in passing multiple variables to summ.stats. For example:
DT[, summ.stats(b, c), by=a]
I would like to get something like this:
a Var Min Mean S.D. Median Max
1: 1 b 1 1.5 0.7071068 1.5 2
2: 2 b 1 2.5 2.1213203 2.5 4
3: 3 b 3 3.5 0.7071068 3.5 4
4: 1 c 2 3.5 2.1213203 3.5 5
5: 2 c 2 2.5 0.7071068 2.5 3
6: 3 c 1 2.5 2.1213203 2.5 4
What is the best way to do this?
Recommended answer
Alternatively, you can modify your function so that it takes the whole .SD (every column except the by column a) and applies each statistic column by column:
summ.stats <- function(vec) {
  list(
    Var = names(vec),
    Min = sapply(vec, min),
    Mean = sapply(vec, mean),
    S.D. = sapply(vec, sd),
    Median = sapply(vec, median),
    Max = sapply(vec, max))
}
DT[, summ.stats(.SD), by=a] # no need for as.list(.SD) as Roger mentions
a Var Min Mean S.D. Median Max
1: 1 b 1 1.5 0.7071068 1.5 2
2: 1 c 2 3.5 2.1213203 3.5 5
3: 2 b 1 2.5 2.1213203 2.5 4
4: 2 c 2 2.5 0.7071068 2.5 3
5: 3 b 3 3.5 0.7071068 3.5 4
6: 3 c 1 2.5 2.1213203 2.5 4
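If you would rather leave the original single-vector function untouched, another option is to reshape the table to long format first and group by both a and the variable name. The following is only a sketch under that assumption: the names summ.stats.vec, long and res are made up for illustration and are not part of the answer above.

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3, 1, 2, 3),
                 b = c(1, 4, 3, 2, 1, 4),
                 c = c(2, 3, 4, 5, 2, 1))

# The single-vector function from the question, renamed here
# (hypothetical name) so it does not clash with the modified version.
summ.stats.vec <- function(vec) {
  list(
    Min = min(vec),
    Mean = mean(vec),
    S.D. = sd(vec),
    Median = median(vec),
    Max = max(vec))
}

# Reshape to long format: one row per (a, Var, value) triple.
long <- melt(DT, id.vars = "a", measure.vars = c("b", "c"),
             variable.name = "Var", value.name = "value")

# Group by both the key column and the variable name, so the
# function still only ever sees one numeric vector at a time.
res <- long[, summ.stats.vec(value), by = .(a, Var)]
print(res)
```

Because by keeps groups in order of first appearance, the rows should come out with all b groups first and then all c groups, which is the layout asked for in the question. Note that melt makes Var a factor by default.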