A replacement for `subset()` for a list of data.frames
Problem description
Function `foo1` can subset (using `subset()`) a list of data.frames by one or more requested variables (e.g., `by = ESL == 1` or `by = ESL == 1 & type == 4`).
However, I'm aware of the danger of using `subset()` in R. Thus, I wonder: in `foo1` below, what can I use instead of `subset()` to get the same output?
foo1 <- function(data, by){
  s <- substitute(by)
  L <- split(data, data$study.name)
  L[[1]] <- NULL
  ## What to use instead of `subset` to get the same output?
  lapply(L, function(x) do.call("subset", list(x, s)))
}
# EXAMPLE OF USE:
D <- read.csv("https://raw.githubusercontent.com/izeh/i/master/k.csv", header=TRUE) # DATA
foo1(D, ESL == 1)
Recommended answer
You can compute on the language. Building on my answer to "Working with substitute after `$` sign in R":
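foo1 <- function(data, by){
  s <- substitute(by)            # unevaluated condition, e.g. ESL == 1
  L <- split(data, data$study.name)
  L[[1]] <- NULL
  E <- quote(x$a)                # template call: x$<placeholder>
  E[[3]] <- s[[2]]               # insert the column name: x$ESL
  s[[2]] <- E                    # condition becomes x$ESL == 1
  eval(bquote(lapply(L, function(x) x[.(s), ])))
}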
foo1(D, ESL == 1)
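To make the rewriting step concrete, here is a minimal sketch of what happens to the expression, run outside the function. The toy data frame `x` is made up for illustration and is not the data from the question.

```r
# The condition as substitute() would capture it
s <- quote(ESL == 1)

# Template call x$a; its third element is the name after `$`
E <- quote(x$a)
E[[3]] <- s[[2]]   # E is now x$ESL
s[[2]] <- E        # s is now x$ESL == 1

print(s)

# Toy data (hypothetical), only to show that the rewritten call can be evaluated
x <- data.frame(ESL = c(1, 2, 1), type = c(4, 4, 3))
res <- x[eval(s), , drop = FALSE]
print(res)
```

One difference from `subset()` worth keeping in mind: `x[cond, ]` returns rows of `NA` where the condition evaluates to `NA`, whereas `subset()` drops those rows. Wrapping the condition in `which()` would bring the behavior closer to `subset()`.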
This gets more complex for arbitrary subset expressions. You'd need a recursive function that crawls the parse tree and inserts the calls to `$` at the right places.
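A rough sketch of such a recursive function could look like the following. The names `qualify_cols` and `foo2` are hypothetical and not part of the original answer. The sketch assumes that every symbol matching a column name should be qualified, and it does not handle edge cases such as empty arguments (`x[1, ]`), literal `NULL` arguments, or expressions that already contain `$`.

```r
# Hypothetical helper: walk the expression tree and replace every symbol
# that names a column with x$<column>.
qualify_cols <- function(expr, cols, data_sym = quote(x)) {
  if (is.name(expr)) {
    # A bare symbol: qualify it only if it names a column
    if (as.character(expr) %in% cols) {
      return(call("$", data_sym, expr))
    }
    return(expr)
  }
  if (is.call(expr)) {
    # Leave the function name (element 1) alone; recurse into the arguments
    for (i in seq_along(expr)[-1]) {
      expr[[i]] <- qualify_cols(expr[[i]], cols, data_sym)
    }
  }
  expr  # constants such as numbers and strings pass through unchanged
}

# Hypothetical variant of foo1 built on the helper
foo2 <- function(data, by) {
  s <- qualify_cols(substitute(by), names(data))
  env <- parent.frame()              # where non-column variables are looked up
  L <- split(data, data$study.name)
  L[[1]] <- NULL                     # mirrors the original foo1
  lapply(L, function(x) {
    keep <- eval(s, list(x = x), env)
    x[which(keep), , drop = FALSE]   # which() drops NA, as subset() does
  })
}
```

With the data from the question, `foo2(D, ESL == 1 & type == 4)` would be the intended call; it should select the same rows as the `subset()` version, though that has not been checked against the original CSV here.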
Personally, I'd just use package data.table, where this is easier because you don't need `$`, i.e., you can just do `eval(bquote(lapply(L, function(x) setDT(x)[.(s),])))` without changing `s`. OTOH, I wouldn't do this at all. There is really no reason to split before subsetting.
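As an illustration of subsetting before splitting, here is a minimal base-R sketch. The name `foo3` is hypothetical, and the approach is one possible reading of the remark above, not the answerer's own code. It evaluates the condition once against the whole data frame, then splits the remaining rows; keeping the grouping variable as a factor preserves studies that end up with zero rows, as the split-first version does.

```r
# Hypothetical variant: evaluate the condition once, then split the result
foo3 <- function(data, by) {
  s <- substitute(by)
  # Columns of `data` are visible to the condition; other variables are
  # looked up in the caller's environment
  keep <- eval(s, data, parent.frame())
  idx <- which(keep)                 # which() drops NA, as subset() does
  f <- factor(data$study.name)       # keep all study levels, even empty ones
  L <- split(data[idx, , drop = FALSE], f[idx])
  L[[1]] <- NULL                     # mirrors the original foo1
  L
}
```

Because the expression is evaluated with the data frame itself as the evaluation environment, no rewriting of `s` with `$` is needed in this version.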