在数据库的dplyr过滤器函数中传递SQL函数 [英] Pass SQL functions in dplyr filter function on database
问题描述
我正在使用 dplyr
的自动SQL后端从数据库表中查询子表。例如,
I'm using dplyr
's automatic SQL backend to query subtable from a database table. E.g.
my_tbl <- tbl(my_db, "my_table")
其中数据库中的 my_table
看起来像
where my_table
in the database looks like
batch_name value
batch_A_1 1
batch_A_2 2
batch_A_2 3
batch_B_1 8
batch_B_2 9
...
我只想要 batch_A _#
中的数据,不论数量。
I just want the data from batch_A_#
, regardless of the number.
如果我是用SQL编写的,则可以使用
If I were writing this in SQL, I could use
select * where batch_name like 'batch_A_%'
如果我在R中编写此代码,则可以使用几种获取方法: grepl()
,%in%
或 str_detect()
If I were writing this in R, I could use a few ways to get this: grepl()
, %in%
, or str_detect()
# option 1
subtable <- my_tbl %>% select(batch_name, value) %>%
filter(grepl('batch_A_', batch_name, fixed = T))
# option 2
subtable <- my_tbl %>% select(batch_name, value) %>%
filter(str_detect(batch_name, 'batch_A_'))
所有这些都会产生以下Postgres错误:提示:没有函数与给定的名称和参数类型匹配。您可能需要添加显式类型强制转换
All of these give the following Postgres error: HINT: No function matches the given name and argument types. You might need to add explicit type casts
因此,如何传递SQL字符串函数或匹配函数以帮助生成生成的dplyr SQL查询能够在过滤器
中使用更灵活的功能?
So, how do I pass in SQL string functions or matching functions to help make the generated dplyr SQL query able to use a more flexible range of functions in filter
?
(仅供参考, %in%
函数确实有效,但需要列出所有可能的值,可以将其与 paste
组合以创建列表,但不可以
(FYI the %in%
function does work, but requires listing out all possible values. This would be okay combined with paste
to make a list, but does not work in a more general regex case)
推荐答案
仅 dplyr
的解决方案是
tbl(my_con, "my_table") %>%
filter(batch_name %like% "batch_A_%") %>%
collect()
完整 reprex :
suppressPackageStartupMessages({
library(dplyr)
library(dbplyr)
library(RPostgreSQL)
})
my_con <-
dbConnect(
PostgreSQL(),
user = "my_user",
password = "my_password",
host = "my_host",
dbname = "my_db"
)
my_table <- tribble(
~batch_name, ~value,
"batch_A_1", 1,
"batch_A_2", 2,
"batch_A_2", 3,
"batch_B_1", 8,
"batch_B_2", 9
)
copy_to(my_con, my_table)
tbl(my_con, "my_table") %>%
filter(batch_name %like% "batch_A_%") %>%
collect()
#> # A tibble: 3 x 2
#> batch_name value
#> * <chr> <dbl>
#> 1 batch_A_1 1
#> 2 batch_A_2 2
#> 3 batch_A_2 3
dbDisconnect(my_con)
#> [1] TRUE
之所以有效,是因为dplyr的任何函数都不知道如何使用
翻译将按原样传递,请参见
?dbplyr ::
This works because any functions that dplyr doesn't know how to
translate will be passed along as is, see
?dbplyr::translate\_sql
.
帽子提示到 @ PaulRougieux 的最新评论
此处
Hat-tip to @PaulRougieux for his recent comment here
这篇关于在数据库的dplyr过滤器函数中传递SQL函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!