Is there a _merge indicator available after a merge?
Question
Is there a way to get the equivalent of a _merge indicator variable after a merge in dplyr?
I am looking for something similar to Pandas' indicator = True option, which essentially tells you how the merge went (how many rows from each dataset found a match, and so on).
Here is an example in Pandas:
import pandas as pd
df1 = pd.DataFrame({'key1' : ['a','b','c'], 'v1' : [1,2,3]})
df2 = pd.DataFrame({'key1' : ['a','b','d'], 'v2' : [4,5,6]})
match = df1.merge(df2, how = 'left', indicator = True)
Here, after a left join between df1 and df2, you want to know immediately how many rows in df1 found a match in df2 and how many did not.
match
Out[53]:
  key1  v1   v2     _merge
0    a   1  4.0       both
1    b   2  5.0       both
2    c   3  NaN  left_only
I can then tabulate this _merge variable:
match._merge.value_counts()
Out[52]:
both 2
left_only 1
right_only 0
Name: _merge, dtype: int64
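To make explicit what the indicator column encodes, here is a minimal sketch in plain Python. The helper `merge_indicator` is hypothetical (it is not part of Pandas or dplyr), and it assumes the join keys are unique within each table; it only labels keys and does not carry the value columns along.

```python
from collections import Counter

def merge_indicator(left_keys, right_keys, how="left"):
    """Hypothetical helper: label each key the way a merge indicator would.

    Assumes keys are unique within each input.
    """
    left, right = set(left_keys), set(right_keys)
    labels = {}
    # Every left key is kept; it is "both" only if the right side has it too.
    for key in left_keys:
        labels[key] = "both" if key in right else "left_only"
    # Keys found only on the right appear only in an outer join.
    if how == "outer":
        for key in right_keys:
            if key not in left:
                labels[key] = "right_only"
    return labels

labels = merge_indicator(["a", "b", "c"], ["a", "b", "d"])
counts = Counter(labels.values())
print(labels)
print(counts)
```

For the left join in the example, this gives two keys labelled both and one labelled left_only, which matches the tabulation above.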
I don't see any comparable option after, say, a left join in dplyr:
library(dplyr)

key1 = c('a','b','c')
v1 = c(1,2,3)
key2 = c('a','b','d')
v2 = c(4,5,6)
df1 = data.frame(key1,v1)
df2 = data.frame(key2,v2)

> left_join(df1,df2, by = c('key1' = 'key2'))
  key1 v1 v2
1    a  1  4
2    b  2  5
3    c  3 NA
Am I missing something here? Thanks!
Recommended answer
We can build the indicator from an inner_join and an anti_join, labelling each result and then combining the two with bind_rows:
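library(dplyr)

# Rows of df1 that have a match in df2
d1 <- inner_join(df1, df2, by = c('key1' = 'key2')) %>%
  mutate(merge = "both")

# Append the rows of df1 without a match; bind_rows fills v2 with NA
bind_rows(d1, anti_join(df1, df2, by = c('key1' = 'key2')) %>%
  mutate(merge = 'left_only'))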