How to do "(df1 & not df2)" dataframe merge in pandas?
Question
I have 2 pandas dataframes df1 & df2 with common columns/keys (x,y).
I want to do a "(df1 & not df2)" kind of merge on keys (x,y), meaning I want my code to return a dataframe containing only the rows whose (x,y) appears in df1 and not in df2.
SAS has equivalent functionality:
data final;
merge df1(in=a) df2(in=b);
by x y;
if a & not b;
run;
How can I replicate the same functionality in pandas elegantly? It would have been great if we could specify how="left-right" in merge().
Recommended answer
I just upgraded to version 0.17.0 RC1, which was released 10 days ago. I just found out that pd.merge() has a new argument in this release, indicator=True, which achieves this in a pandonic way!
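# Outer merge on the keys; indicator=True adds a _merge column recording each row's source
df = pd.merge(df1, df2, on=['x', 'y'], how="outer", indicator=True)
# Keep only the rows whose (x, y) key appears in df1 but not in df2
df = df[df['_merge'] == 'left_only']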
indicator: Add a column to the output DataFrame called _merge with information on the source of each row. _merge is Categorical-type and takes on a value of left_only for observations whose merge key only appears in 'left' DataFrame, right_only for observations whose merge key only appears in 'right' DataFrame, and both if the observation’s merge key is found in both.
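As a minimal end-to-end sketch of the approach described above, the following uses small made-up frames. The sample values and the extra columns v1 and v2 are assumptions for illustration only; just the key columns x and y come from the question. It uses how="left" rather than how="outer", which should give the same left_only rows, and it merges only against the key columns of df2 so that no df2 columns end up in the result.

```python
import pandas as pd

# Hypothetical sample data; only the key columns x and y come from the question.
df1 = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"], "v1": [10, 20, 30]})
df2 = pd.DataFrame({"x": [2, 3, 4], "y": ["b", "z", "d"], "v2": [200, 300, 400]})

# Merge against df2's distinct keys only, so df1 rows are not duplicated
# and no non-key columns from df2 are carried into the result.
keys = df2[["x", "y"]].drop_duplicates()
merged = pd.merge(df1, keys, on=["x", "y"], how="left", indicator=True)

# Keep rows whose key exists only in df1, then drop the helper column.
only_df1 = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_df1)
```

Here the key (2, "b") is present in both frames and is filtered out, while (3, "c") is kept because df2 only contains (3, "z"); both key columns have to match.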