Can dplyr for SQL translate == NA to IS NULL in filters?

Question

I'm trying to use dplyr to query a SQL database, matching on provided arguments.

  id <- tbl(conn, "My_Table") %>%
    filter(Elem1 == elem1 & Elem2 == elem2 & Elem3 == elem3) %>%
    select(Id) %>%
    collect()

However, any of elem1, elem2, or elem3 might be NA. Ideally, I'd like the query to translate those comparisons into SQL IS NULL predicates.

For example, if elem1 is 1, elem2 is NA, and elem3 is 3, I'd like the translated query to be:

SELECT Id FROM My_Table WHERE Elem1 = 1 AND Elem2 IS NULL AND Elem3 = 3

However, my code above translates the WHERE clause to ... AND Elem2 = NULL ..., which doesn't do what I want, because a comparison with NULL is never true in SQL. Is there a nice way to solve this problem?

Recommended answer

Assuming you are on SQL Server, you can work around this with COALESCE, like so:

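# Sentinel value; it must not occur in Elem1, Elem2, or Elem3
filler_value <- -1

id <- tbl(conn, "My_Table") %>%
    # Replace NULLs in the columns with the sentinel
    mutate(Elem1 = COALESCE(Elem1, filler_value),
           Elem2 = COALESCE(Elem2, filler_value),
           Elem3 = COALESCE(Elem3, filler_value)) %>%
    # Replace NA arguments with the same sentinel, then compare as usual
    filter(Elem1 == COALESCE(elem1, filler_value),
           Elem2 == COALESCE(elem2, filler_value),
           Elem3 == COALESCE(elem3, filler_value)) %>%
    select(Id) %>%
    collect()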

Choose filler_value so that it has the same data type (text/numeric/date) as your dataset columns but is not a value that currently appears in those columns.

The COALESCE function returns the first non-null value from its list of arguments. So first we replace NULL in the Elem_ columns with a placeholder, and then we replace NULL in the elem_ values with the same placeholder. After that, a standard == comparison matches NULL against NULL as intended.
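To inspect the SQL this produces without a live connection, here is a minimal sketch using dbplyr's simulated SQL Server backend. The lazy table and its column types are assumptions made for illustration (the table is rendered under dbplyr's default name rather than My_Table), and only Elem2 is shown to keep it short.

```r
library(dplyr)
library(dbplyr)

filler_value <- -1
elem2 <- NA  # the argument that may be missing

# Assumed stand-in for tbl(conn, "My_Table"); no database is contacted
my_table <- lazy_frame(
  Id = 1L, Elem1 = 1, Elem2 = 1, Elem3 = 1,
  con = simulate_mssql()
)

query <- my_table %>%
  mutate(Elem2 = COALESCE(Elem2, filler_value)) %>%
  filter(Elem2 == COALESCE(elem2, filler_value)) %>%
  select(Id)

# Render the query as a SQL string instead of executing it
sql <- as.character(sql_render(query))
cat(sql, "\n")
```

The printed SQL should contain COALESCE calls on both sides of the comparison, with the local NA inlined as NULL.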

One of the key ideas here is that COALESCE has no R-to-SQL translation defined, so it is passed through unchanged when the R code is translated to SQL. See this question for more details/an alternative.
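A different approach, not part of the answer above, avoids the sentinel value entirely: is.na() does have a SQL translation (to IS NULL), so the filter conditions can be built in R depending on whether each argument is NA. The sketch below assumes the rlang package; match_or_null is a hypothetical helper name, and the lazy table is an illustrative stand-in for tbl(conn, "My_Table").

```r
library(dplyr)
library(dbplyr)
library(rlang)

# Hypothetical helper: returns is.na(col) when value is NA, else col == value
match_or_null <- function(col, value) {
  col <- sym(col)
  if (is.na(value)) call2("is.na", col) else call2("==", col, value)
}

elem1 <- 1
elem2 <- NA
elem3 <- 3

conds <- list(
  match_or_null("Elem1", elem1),
  match_or_null("Elem2", elem2),
  match_or_null("Elem3", elem3)
)

# Assumed stand-in for the real table; no database is contacted
my_table <- lazy_frame(
  Id = 1L, Elem1 = 1, Elem2 = 1, Elem3 = 1,
  con = simulate_mssql()
)

# Splice the list of conditions into filter() and render the SQL
query <- my_table %>%
  filter(!!!conds) %>%
  select(Id)

sql <- as.character(sql_render(query))
cat(sql, "\n")
```

Because the branch is taken in R before translation, the generated WHERE clause uses IS NULL for the NA argument and a plain equality for the others, and no placeholder value has to be chosen.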
