Compare Multiple Columns to Get Rows that are Different in Two Pandas Dataframes

Question

I have two dataframes:

df1=
    A    B   C
0   A0   B0  C0
1   A1   B1  C1
2   A2   B2  C2

df2=
    A    B   C
0   A2   B2  C10
1   A1   B3  C11
2   A9   B4  C12

and I want to find rows in df1 that are not found in df2 based on one or two columns (or more columns). So, if I only compare column 'A', then the following rows from df1 are not found in df2 (note that column 'B' and column 'C' are not used for the comparison between df1 and df2):

    A    B   C
0   A0   B0  C0

I would like to return a Series (True where the df1 row is found in df2):

0   False
1   True
2   True

Or, if I only compare column 'A' and column 'B', then the following rows from df1 are not found in df2 (note that column 'C' is not used for the comparison between df1 and df2):

    A    B   C
0   A0   B0  C0
1   A1   B1  C1

I would like to return a Series:

0   False
1   False
2   True

I know how to accomplish this using sets, but I am looking for a straightforward Pandas way of accomplishing this.

Recommended answer

Ideally, one would like to be able to just use ~df1[COLS].isin(df2[COLS]) as a mask, but when isin is passed a DataFrame, both the index and column labels must match (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html)
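
To make the label-matching point concrete, here is a small sketch on the sample data from the question. The variable names `aligned` and `membership` are illustrative only; the comparison uses column 'A' alone.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"],
                    "B": ["B0", "B1", "B2"],
                    "C": ["C0", "C1", "C2"]})
df2 = pd.DataFrame({"A": ["A2", "A1", "A9"],
                    "B": ["B2", "B3", "B4"],
                    "C": ["C10", "C11", "C12"]})

COLS = ["A"]

# DataFrame.isin(DataFrame) compares cell by cell at matching index/column labels
aligned = df1[COLS].isin(df2[COLS])["A"]

# Series.isin(Series) tests membership in the set of values, ignoring the index
membership = df1["A"].isin(df2["A"])

print(aligned.tolist())     # [False, True, False]
print(membership.tolist())  # [False, True, True]
```

Row 2 of df1 holds A2, which does occur in df2 but at index 0, so the label-aligned form reports False for it. For a single column, Series.isin already gives the Series the question asks for.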

Here is a succinct form that uses .isin but converts the second DataFrame to a dict so that index labels don't need to match:

COLS = ['A', 'B']  # or whichever columns to use for comparison

# keep the rows of df1 whose COLS values are not all present in df2
df1[~df1[COLS].isin(df2[COLS].to_dict(orient='list')).all(axis=1)]
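
One thing worth checking before relying on the dict form: it tests each column independently, so a df1 row counts as "found" when its 'A' value appears somewhere in df2 and its 'B' value appears somewhere in df2, even if no single df2 row holds both. The sketch below runs the snippet on the question's data and compares it with a row-wise variant built on merge with indicator=True. The helper `rows_found` and the `mixed` frame are hypothetical names introduced here for illustration, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"],
                    "B": ["B0", "B1", "B2"],
                    "C": ["C0", "C1", "C2"]})
df2 = pd.DataFrame({"A": ["A2", "A1", "A9"],
                    "B": ["B2", "B3", "B4"],
                    "C": ["C10", "C11", "C12"]})

COLS = ["A", "B"]

# Column-wise membership test, as in the answer above
found_colwise = df1[COLS].isin(df2[COLS].to_dict(orient="list")).all(axis=1)


def rows_found(left, right, cols):
    """Hypothetical helper: True where a left row's cols values occur together in one right row."""
    # De-duplicate the key rows so the left merge cannot multiply left rows
    keys = right[cols].drop_duplicates()
    merged = left[cols].merge(keys, on=cols, how="left", indicator=True)
    # merge returns a fresh RangeIndex, so restore the index of `left`
    return pd.Series((merged["_merge"] == "both").to_numpy(), index=left.index)


found_rowwise = rows_found(df1, df2, COLS)
missing_rows = df1[~found_rowwise]  # rows of df1 not found in df2

# A row whose values each appear in df2, but never in the same df2 row
mixed = pd.DataFrame({"A": ["A1"], "B": ["B2"], "C": ["C0"]})
mixed_colwise = mixed[COLS].isin(df2[COLS].to_dict(orient="list")).all(axis=1)
mixed_rowwise = rows_found(mixed, df2, COLS)

print(found_colwise.tolist())  # [False, False, True]
print(found_rowwise.tolist())  # [False, False, True]
print(mixed_colwise.tolist())  # [True]
print(mixed_rowwise.tolist())  # [False]
```

On the question's data both approaches agree and return the Series the question asks for. They diverge only for rows like the one in `mixed`, so the column-wise form is fine when that kind of cross-row match cannot occur or does not matter.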
