Prevent dplyr from joining on NA's

Problem description

I'd like to do a full join of two data frames. To my surprise, dplyr's default behavior is to join on NA's if they exist in both data frames. Is there a way to prevent dplyr from doing this?

Here's an example with an inner join:

x <- data.frame(a = c(5, NA, 9), b = 1:3)
y <- data.frame(a = c(5, NA, 9), c = 4:6)
z <- dplyr::inner_join(x, y, by = 'a')

I would like z to contain only 2 records, not 3. Ideally, I want to do this without having to manually filter out the records with NA's beforehand and then append them to the result (since that seems clumsy).

Recommended answer

You can use na_matches = "never". This is in the NEWS for v. 0.7.0 but I don't see it in the documentation. The default is na_matches = "na".

This returns two rows instead of three:

dplyr::inner_join(x, y, by = 'a', na_matches = "never")

  a b c
1 5 1 4
2 9 3 6
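
Since the question is about a full join, here is a minimal sketch of the same argument with dplyr::full_join, assuming a dplyr version whose join functions accept na_matches. The names z_default and z_full are only for illustration. With na_matches = "never" the NA keys are not matched to each other, but a full join still keeps unmatched rows from both sides, so each NA row should appear on its own with the other table's column filled with NA.

```r
library(dplyr)

# Same example data as in the question
x <- data.frame(a = c(5, NA, 9), b = 1:3)
y <- data.frame(a = c(5, NA, 9), c = 4:6)

# Default behaviour: the NA in x$a is matched to the NA in y$a
z_default <- full_join(x, y, by = "a")

# na_matches = "never": NA keys never match each other, but a full join
# still keeps both NA rows as unmatched rows
z_full <- full_join(x, y, by = "a", na_matches = "never")

nrow(z_default)  # number of rows when NA keys are matched
nrow(z_full)     # number of rows when NA keys are kept apart
```

If the unmatched NA rows are not wanted at all, an inner join with na_matches = "never", as shown above, drops them.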
