Can dplyr for SQL translate == NA to IS NULL in filters?
Question
I'm trying to use dplyr to query a SQL database, matching on provided arguments.
id <- tbl(conn, "My_Table") %>%
filter(Elem1 == elem1 & Elem2 == elem2 & Elem3 == elem3) %>%
select(Id) %>%
collect()
However, it's possible that any of elem1, elem2, or elem3 might be NA. Ideally, I'd like the query to translate them to the SQL IS NULL statement.
For example, if elem1 is 1, elem2 is NA, and elem3 is 3, I'd like the translated query to be:
SELECT Id FROM My_Table WHERE Elem1 = 1 AND Elem2 IS NULL AND Elem3 = 3
However, my code above converts the WHERE clause to ... AND Elem2 = NULL ..., which obviously doesn't do what I want. Is there a nice way to solve this problem?
Recommended answer
Assuming you are on SQL Server, you can work around this using COALESCE, like so:
filler_value <- -1  # placeholder; must not occur in Elem1, Elem2, or Elem3
id <- tbl(conn, "My_Table") %>%
mutate(Elem1 = COALESCE(Elem1, filler_value),
Elem2 = COALESCE(Elem2, filler_value),
Elem3 = COALESCE(Elem3, filler_value)) %>%
filter(Elem1 == COALESCE(elem1, filler_value),
Elem2 == COALESCE(elem2, filler_value),
Elem3 == COALESCE(elem3, filler_value)) %>%
select(Id) %>%
collect()
Here filler_value is chosen so that it has the same data type (text/numeric/date) as your dataset columns, but is not a value that currently appears in those columns.
The COALESCE function returns the first non-null value from its list of arguments. So first we replace NULL in the Elem_ columns with a placeholder, and then we replace NULL in the elem_ values with the same placeholder. Hence a standard == comparison makes sense.
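To see why the placeholder makes the comparison work, here is a minimal local sketch in plain R. It uses dplyr's own coalesce() on ordinary vectors rather than a database column, so it only illustrates the logic; the names column_values and wanted are made up for this example.

```r
library(dplyr)

filler_value <- -1

# Stand-in for a database column that contains a missing value.
column_values <- c(1, NA, 3)

# Stand-in for a supplied argument that happens to be missing.
wanted <- NA_real_

# A direct comparison against NA never yields TRUE.
direct <- column_values == wanted

# After replacing missing values on both sides with the same placeholder,
# the missing entry compares equal to the missing argument.
matches <- coalesce(column_values, filler_value) == coalesce(wanted, filler_value)

print(direct)
print(matches)
```

The second comparison flags only the position where the column was missing, which is the behaviour the COALESCE-based query relies on.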
One of the key ideas here is that, because COALESCE does not have an R-to-SQL translation defined, it is left as-is when the R code is translated to SQL. See this question for more details/an alternative.
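One possible alternative that avoids a placeholder (not necessarily the one in the linked question) is to build each filter condition in R, using is.na() whenever the supplied value is NA. dbplyr translates is.na() to IS NULL, so on a remote table this should yield the desired WHERE clause. The sketch below is only an illustration: match_or_null is a hypothetical helper, and a local tibble stands in for tbl(conn, "My_Table") so the code runs without a database.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: build one filter condition for a column,
# using is.na() when the supplied value is NA and == otherwise.
match_or_null <- function(col, value) {
  col <- sym(col)
  if (is.na(value)) expr(is.na(!!col)) else expr(!!col == !!value)
}

elem1 <- 1
elem2 <- NA
elem3 <- 3

conds <- list(
  match_or_null("Elem1", elem1),
  match_or_null("Elem2", elem2),
  match_or_null("Elem3", elem3)
)

# Local stand-in for tbl(conn, "My_Table"); the data are invented.
my_table <- tibble(
  Id    = 1:3,
  Elem1 = c(1, 1, 2),
  Elem2 = c(NA, 5, NA),
  Elem3 = c(3, 3, 3)
)

# Splice the conditions into filter(); multiple conditions are combined with AND.
id <- my_table %>%
  filter(!!!conds) %>%
  select(Id)

print(id)
```

With a real connection, replacing my_table with tbl(conn, "My_Table") and appending collect() would follow the same pattern as the original query; calling show_query() before collect() is a way to check the generated SQL.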