In R: subset or dplyr::filter with variable from vector
Question
df <-
data.frame(a=LETTERS[1:4],
b=rnorm(4)
)
vals <- c("B","D")
I can filter/subset df with the values in vals using:
dplyr::filter(df, a %in% vals)
subset(df, a %in% vals)
Both give:
a b
2 B 0.4481627
4 D 0.2916513
What if I have the variable name in a vector, e.g.:
> names(df)[1]
[1] "a"
Then it doesn't work, I guess because the name is quoted (it is a character string, not a column reference):
dplyr::filter(df, names(df)[1] %in% vals)
[1] a b
<0 rows> (or 0-length row.names)
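A small sketch of what seems to be happening, using the same toy data (the helper variable `lhs` is introduced here only for illustration): `names(df)[1]` evaluates to the string "a", so the comparison tests that one string against vals rather than testing the column's values.

```r
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

# The left-hand side is just the character string "a" ...
lhs <- names(df)[1]

# ... so this is a single comparison of "a" against c("B", "D")
name_test <- lhs %in% vals
print(name_test)

# whereas the column itself yields one logical value per row
column_test <- df$a %in% vals
print(column_test)
```

A single FALSE is recycled over all rows, which would explain the empty result.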
How can I do this?
Update (what if it's dplyr::tbl_df(df)?)
The answers below work fine for data.frames, but not for data wrapped in dplyr::tbl_df:
df<-dplyr::tbl_df(df)
dplyr::filter(df, df[,names(df)[1]] %in% vals)
This does not work (I thought tbl_df was a simple wrapper on top of df?)
This works again:
dplyr::filter(df, as.data.frame(df)[,names(df)[1]] %in% vals)
Final update: it works with tbl_df() using lazyeval::interp.
See AndreyAkinshin's solution below.
Recommended answer
You can use df[,"a"] or df[,1]:
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
dplyr::filter(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
dplyr::filter(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
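When the column name is only available as a string, the same idea can be sketched with `[[`, which extracts the column as a plain vector. This is an illustrative variant rather than part of the original answer; the names `col`, `keep` and `result` are assumptions made for the example, and `set.seed()` is there only to make it reproducible.

```r
# Illustrative data, same shape as in the question
set.seed(1)
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

# Column name held as a string, as in the question
col <- names(df)[1]

# `[[` returns the column as a vector, so %in% gives one
# logical value per row
keep <- df[[col]] %in% vals

# Plain base-R row indexing with the logical vector
result <- df[keep, ]
print(result)
```

Because `[[` also returns a plain vector for a tbl_df, while `df[, col]` on a tbl_df returns a one-column table, this form should behave the same for both kinds of object.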
With dplyr::tbl_df(df)
Some magic with lazyeval::interp helps us!
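library(dplyr)  # provides %>% and filter_()
df <- dplyr::tbl_df(df)
# Build the expression `a %in% vals`, substituting the column name as a symbol
expr <- lazyeval::interp(quote(x %in% y), x = as.name(names(df)[1]), y = vals)
df %>% filter_(expr)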
# Source: local data frame [2 x 2]
#
# a b
# 1 B 0.4481627
# 2 D 0.2916513
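The underscored verbs such as `filter_()` were deprecated from dplyr 0.7.0 onward in favour of tidy evaluation. A minimal sketch of the equivalent filter using the `.data` pronoun is below, assuming a dplyr version with tidy evaluation support. It is not part of the original answer, and the names `col` and `result` are illustrative.

```r
library(dplyr)

# Illustrative data as a tibble; set.seed() only makes the sketch reproducible
set.seed(1)
df <- tibble::tibble(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

# Column name held as a string
col <- names(df)[1]

# .data[[col]] looks up the column named by the string inside the data,
# so the condition is evaluated per row
result <- df %>% filter(.data[[col]] %in% vals)
print(result)
```

This avoids building the expression by hand with lazyeval, at the cost of requiring a more recent dplyr than the one the answer was written against.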