Non-standard evaluation (NSE) in dplyr's filter_ & pulling data from MySQL
I'd like to pull some data from a MySQL database with a dynamic filter. I'm using the great R package dplyr in the following way:
# Create the filter
filter_criteria <- ~ column1 %in% some_vector
# Connect to the database (arguments are passed by name with `=`, not `<-`)
connection <- src_mysql(dbname = "mydbname",
                        user = "myusername",
                        password = "mypwd",
                        host = "myhost")
# Get data
data <- connection %>%
  tbl("mytable") %>%                      # Specify which table
  filter_(.dots = filter_criteria) %>%    # Non-standard evaluation filter
  collect()                               # Pull data
This piece of code works fine but now I'd like to loop it somehow on all the columns of my table, thus I'd like to write the filter as:
#Dynamic filter
i <- 2 #With a loop on this i for instance
which_column <- paste0("column",i)
filter_criteria <- ~ which_column %in% some_vector
And then reapply the first code with the updated filter.
Unfortunately, this approach doesn't give the expected result. It raises no error, but it doesn't pull any rows into R either. I looked at the SQL queries generated by the two pieces of code, and there is one important difference.
While the first, working, code generates a query of the form:
SELECT ... FROM ... WHERE
`column1` IN ....
(note the backticks around the column name, which mark it as an identifier), the second one generates a query of the form:
SELECT ... FROM ... WHERE
'column1' IN ....
(note the single quotes around the column name, which make it a string literal).
Does anyone have any suggestion on how to formulate the filtering condition to make it work?
It's not really related to SQL. This example in R does not work either:
df <- data.frame(
v1 = sample(5, 10, replace = TRUE),
v2 = sample(5, 10, replace = TRUE)
)
df %>% filter_(~ "v1" == 1)
It does not work because you need to pass filter_ the expression ~ v1 == 1, not the expression ~ "v1" == 1.
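To make the difference concrete, here is a minimal base-R sketch. The small data frame and the variable names are made up for illustration, and evaluating the formula's right-hand side with eval() only approximates what filter_ does internally.

```r
# Illustrative data; the values are arbitrary.
df <- data.frame(v1 = c(1, 2, 1, 3), v2 = c(4, 1, 1, 2))

# A one-sided formula stores its right-hand side unevaluated.
with_symbol <- ~ v1 == 1     # v1 is a symbol, looked up as a column
with_string <- ~ "v1" == 1   # "v1" is just a character constant

# Evaluate each right-hand side with the data frame's columns in scope.
keep_symbol <- eval(with_symbol[[2]], df)
keep_string <- eval(with_string[[2]], df)

keep_symbol  # one logical value per row
keep_string  # a single FALSE: the string "v1" is compared with the number 1
```

The string version never touches the column, which matches the quoted 'column1' in the generated SQL.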
dplyr version >= 0.6
To solve the problem, use the quoting function quo and the unquoting operator !!
library(dplyr)
which_column <- quo(v1)
# The parentheses make sure !! applies to which_column only, not to the whole comparison
df %>% filter((!!which_column) == 1)
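Applied to the loop from the question, the same idea could look like the sketch below. It assumes the column names arrive as strings, so it converts each one with rlang::sym (not used in the answer above) instead of quo. The data frame and some_vector are made up, and the example runs on a local data frame rather than a MySQL table.

```r
library(dplyr)

# Illustrative data and filter values.
df <- data.frame(v1 = c(1, 2, 1, 3), v2 = c(4, 1, 1, 2))
some_vector <- c(1, 3)

results <- list()
for (column_name in names(df)) {
  # Turn the string into a symbol so it refers to the column, not to a string.
  column_sym <- rlang::sym(column_name)
  results[[column_name]] <- df %>%
    filter((!!column_sym) %in% some_vector)
}

sapply(results, nrow)  # number of rows kept for each column
```

The same filter() call should translate to a backtick-quoted column for a remote tbl, though that is not exercised here.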
dplyr version < 0.6
To solve the problem, use the function interp from the lazyeval package.
library(lazyeval)
filter_criteria <- interp(~ which_column == 1, which_column = as.name("v1"))
df %>% filter_(filter_criteria)
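For the loop over columns in the question, one formula per column can be built the same way. This is a sketch under assumptions: the placeholder names column and values, the data frame and some_vector are invented for illustration, and the formulas are checked with lazyeval's f_eval instead of filter_, so the example does not depend on an old dplyr version.

```r
library(lazyeval)

# Illustrative data and filter values.
df <- data.frame(v1 = c(1, 2, 1, 3), v2 = c(4, 1, 1, 2))
some_vector <- c(1, 3)

# Build one filter formula per column name.
criteria <- lapply(names(df), function(column_name) {
  interp(~ column %in% values,
         column = as.name(column_name),  # symbol, so it refers to the column
         values = some_vector)
})
names(criteria) <- names(df)

# Check each formula locally: one logical vector per column.
rows_kept <- lapply(criteria, f_eval, data = df)
rows_kept
```

With dplyr < 0.6, each element of criteria can then be passed to filter_ just like filter_criteria above.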