Non-standard evaluation (NSE) in dplyr's filter_ & pulling data from MySQL


Problem description

I'd like to pull some data from a sql server with a dynamic filter. I'm using the great R package dplyr in the following way:

library(dplyr)

# Create the filter
filter_criteria <- ~ column1 %in% some_vector
# Connect to the database (arguments are named with `=`, not `<-`)
connection <- src_mysql(dbname = "mydbname",
                        user = "myusername",
                        password = "mypwd",
                        host = "myhost")
# Get data
data <- connection %>%
  tbl("mytable") %>%                    # Specify which table
  filter_(.dots = filter_criteria) %>%  # Non-standard evaluation filter
  collect()                             # Pull data

This code works fine, but now I'd like to loop over all the columns of my table, so I'd like to write the filter as:

# Dynamic filter
i <- 2  # In practice, i comes from a loop
which_column <- paste0("column", i)
filter_criteria <- ~ which_column %in% some_vector

I then rerun the first snippet with the updated filter.

Unfortunately, this approach doesn't give the expected result. It raises no error, but it returns no rows to R. Looking at the SQL queries generated by the two versions, there is one important difference.

While the first, working, code generates a query of the form:

SELECT ... FROM ... WHERE 
`column1` IN ....

(backticks around the column name), whereas the second one generates a query of the form:

SELECT ... FROM ... WHERE 
'column1' IN ....

(single quotes around the column name).

Does anyone have any suggestion on how to formulate the filtering condition to make it work?

Solution

It's not really related to SQL. This example in R does not work either:

df <- data.frame(
     v1 = sample(5, 10, replace = TRUE),
     v2 = sample(5,10, replace = TRUE)
)
df %>% filter_(~ "v1" == 1)

It does not work because filter_ needs the expression ~ v1 == 1, not the expression ~ "v1" == 1. In the second form, "v1" is a string constant compared with 1, not a reference to the column v1.
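
The difference can be seen without dplyr at all. Here is a minimal base R sketch that inspects the left-hand operand of each formula; the names `good`, `bad`, `good_operand` and `bad_operand` are made up for illustration:

```r
# Two one-sided formulas: one names a column, the other holds a string.
good <- ~ v1 == 1
bad  <- ~ "v1" == 1

# Element [[2]] of a one-sided formula is its right-hand side (the call to `==`);
# element [[2]] of that call is the left operand.
good_operand <- good[[2]][[2]]
bad_operand  <- bad[[2]][[2]]

is.name(good_operand)      # TRUE: a symbol, resolved as a column name
is.character(bad_operand)  # TRUE: a string constant

# Comparing the constant "v1" with 1 is FALSE whatever the data contains
"v1" == 1
```

This is probably also why the generated SQL shows 'column1' in single quotes: the value reaches the database as a string literal rather than as an identifier.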

dplyr version >= 0.6

To solve the problem, use the quoting function quo and the unquoting operator !!

library(dplyr)
which_column <- quo(v1)
df %>% filter(!!which_column == 1)
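
In the question the column name arrives as a string built with paste0, so it cannot be typed into quo() by hand. One possible way to handle that is to turn the string into a symbol with base R's as.name and unquote it with !!. The sketch below uses a small local data frame instead of the MySQL table, and the data, the contents of `some_vector` and the `results` list are assumptions for illustration:

```r
library(dplyr)

# Illustrative data standing in for the remote table
df <- data.frame(
  column1 = c(1, 2, 3, 4),
  column2 = c(3, 3, 5, 1)
)
some_vector <- c(1, 3)

results <- list()
for (i in 1:2) {
  which_column <- paste0("column", i)
  col_sym <- as.name(which_column)  # string -> symbol
  # Parentheses make the scope of !! explicit
  results[[which_column]] <- df %>% filter((!!col_sym) %in% some_vector)
}

nrow(results$column1)  # rows where column1 is 1 or 3
nrow(results$column2)  # rows where column2 is 1 or 3
```

The same filter() call should carry over to a tbl() on a database connection, though that is not tested here.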

dplyr version < 0.6

To solve the problem, use the function interp from the lazyeval package.

library(lazyeval)
filter_criteria <- interp(~ which_column == 1, which_column = as.name("v1"))
df %>% filter_(filter_criteria)
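
Applied to the loop from the question, the same idea might look like the sketch below. It only builds the formulas and assumes the lazyeval package is installed; the placeholder `col`, the `criteria` list and the values in `some_vector` are illustrative:

```r
library(lazyeval)

some_vector <- c(1, 3)  # illustrative values

criteria <- list()
for (i in 1:2) {
  which_column <- paste0("column", i)
  # Substitute the placeholder `col` with the column name as a symbol
  criteria[[which_column]] <- interp(
    ~ col %in% some_vector,
    col = as.name(which_column)
  )
}

# Each formula now refers to its column by symbol, not by string
all.vars(criteria$column2)
```

Each element of `criteria` can then be passed to filter_(.dots = ...) as in the question's first snippet. Note that filter_ has been deprecated in later dplyr releases.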
