How to do "(df1 & not df2)" dataframe merge in pandas?

Question

I have two pandas dataframes, df1 and df2, with common columns/keys (x,y).

I want to do a "(df1 & not df2)" kind of merge on keys (x,y), meaning I want my code to return a dataframe containing only the rows whose (x,y) appears in df1 and not in df2.

SAS has equivalent functionality:

data final;
merge df1(in=a) df2(in=b);
by x y;
if a & not b;
run;

How can I replicate the same functionality in pandas elegantly? It would be great if we could specify how="left-right" in merge().

Recommended answer

I just upgraded to version 0.17.0 RC1, which was released 10 days ago, and found that pd.merge() has a new argument in this release, indicator=True, that achieves this in an idiomatic pandas way:

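# Outer merge on the keys; indicator=True adds a _merge column recording each row's source.
df = pd.merge(df1, df2, on=['x', 'y'], how="outer", indicator=True)
# Keep only the rows whose (x, y) key appears in df1 but not in df2.
df = df[df['_merge'] == 'left_only']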
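To make the two lines above concrete, here is a minimal self-contained sketch. The sample values and the extra columns v1 and v2 are hypothetical (only the keys x and y come from the question), and trimming the result back to df1's columns is an optional extra step, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; only the key columns x and y come from the question.
df1 = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"], "v1": [10, 20, 30]})
df2 = pd.DataFrame({"x": [2, 3, 4], "y": ["b", "z", "d"], "v2": [200, 300, 400]})

# Outer merge on both keys; indicator=True adds the _merge column.
merged = pd.merge(df1, df2, on=["x", "y"], how="outer", indicator=True)

# Keep rows found only in df1, then restrict to df1's own columns
# (this also drops _merge and the all-NaN v2 column).
only_df1 = merged.loc[merged["_merge"] == "left_only", list(df1.columns)]

# Sort explicitly so the result does not depend on the merge's row order.
only_df1 = only_df1.sort_values(["x", "y"]).reset_index(drop=True)
print(only_df1)
```

Note that (3, "c") is kept even though x = 3 also occurs in df2, because the match is on the (x, y) pair as a whole. Since only the left_only rows are kept, how="left" should select the same rows here; the outer merge simply mirrors the answer.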

indicator: Add a column to the output DataFrame called _merge with information on the source of each row. _merge is Categorical-type and takes on a value of left_only for observations whose merge key only appears in 'left' DataFrame, right_only for observations whose merge key only appears in 'right' DataFrame, and both if the observation’s merge key is found in both.
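As a small illustration of the three _merge values described above, the sketch below builds two tiny frames (hypothetical data, chosen so that each case occurs exactly once) and maps every merge key to its label.

```python
import pandas as pd

# Hypothetical frames: key (1, 1) is only in left, (2, 2) is in both, (3, 3) is only in right.
left = pd.DataFrame({"x": [1, 2], "y": [1, 2]})
right = pd.DataFrame({"x": [2, 3], "y": [2, 3]})

merged = pd.merge(left, right, on=["x", "y"], how="outer", indicator=True)

# Map each (x, y) key to its _merge label, converted from categorical to plain strings.
labels = dict(zip(zip(merged["x"], merged["y"]), merged["_merge"].astype(str)))
print(labels)
print(merged["_merge"].dtype)
```

Here (1, 1) is labelled left_only, (2, 2) is labelled both, and (3, 3) is labelled right_only, so filtering on any one of these labels gives the corresponding kind of join.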

http://pandas-docs.github.io/pandas-docs-travis/merging.html#database-style-dataframe-joining-merging
