Python Pandas - Find difference between two data frames

Views: 36 Published: 2021/12/3 8:34:15 Tags: python pandas dataframe

Problem description

I have two data frames, df1 and df2, where df2 is a subset of df1. How do I get a new data frame (df3) that is the difference between the two?

In other words, a data frame that has all the rows/columns in df1 that are not in df2?

Solution

By using concat followed by drop_duplicates with keep=False:

pd.concat([df1,df2]).drop_duplicates(keep=False)
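For anyone who wants to run the one-liner as is, here is a minimal sketch. The sample data is hypothetical (not from the question) and deliberately has no repeated rows inside df1. Because df2 is a subset of df1, dropping every row that occurs more than once leaves exactly the rows that are only in df1.

```python
import pandas as pd

# Hypothetical example data: df2 is a subset of df1, and df1 has no repeated rows.
df1 = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df2 = pd.DataFrame({"A": [1], "B": [2]})

# Stack both frames, then drop every row that occurs more than once.
# keep=False removes all copies of a duplicated row, so the rows shared by
# df1 and df2 vanish and only the rows unique to df1 remain.
df3 = pd.concat([df1, df2]).drop_duplicates(keep=False)
print(df3)
```

Note that this is really a symmetric difference: if df2 contained rows that are not in df1, they would show up in the result too.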

Update:

The above method only works when neither data frame already contains duplicate rows of its own. For example:

df1=pd.DataFrame({'A':[1,2,3,3],'B':[2,3,4,4]})
df2=pd.DataFrame({'A':[1],'B':[2]})

This produces the output below, which is wrong because the two (3, 4) rows of df1 are missing:

Wrong output:

pd.concat([df1, df2]).drop_duplicates(keep=False)
Out[655]: 
   A  B
1  2  3
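To see why those rows disappear, it can help to look at which rows pandas treats as duplicates in the stacked frame. This is only a diagnostic sketch; the names combined, flags and is_duplicate are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 3], "B": [2, 3, 4, 4]})
df2 = pd.DataFrame({"A": [1], "B": [2]})

combined = pd.concat([df1, df2])

# Flag every row that has at least one identical copy anywhere in the stack.
# The two (3, 4) rows are flagged even though they never appear in df2,
# which is why drop_duplicates(keep=False) throws them away as well.
flags = combined.duplicated(keep=False)
print(combined.assign(is_duplicate=flags))
```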

Correct output:

Out[656]: 
   A  B
1  2  3
2  3  4
3  3  4

How to achieve that?

Method 1: Using isin with tuple

df1[~df1.apply(tuple, axis=1).isin(df2.apply(tuple, axis=1))]
Out[657]: 
   A  B
1  2  3
2  3  4
3  3  4
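Method 1 can also be wrapped in a small helper, which makes the intermediate step (one tuple per row) easier to see. This is a sketch only: rows_not_in is a hypothetical name, not a pandas function, and it assumes both frames have the same columns in the same order.

```python
import pandas as pd


def rows_not_in(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `left` whose values do not occur as a row of `right`.

    Hypothetical helper; assumes identical columns in identical order.
    """
    left_keys = left.apply(tuple, axis=1)    # one tuple per row, e.g. (3, 4)
    right_keys = right.apply(tuple, axis=1)
    # Keep the rows of `left` whose tuple is not among the tuples of `right`.
    return left[~left_keys.isin(right_keys)]


df1 = pd.DataFrame({"A": [1, 2, 3, 3], "B": [2, 3, 4, 4]})
df2 = pd.DataFrame({"A": [1], "B": [2]})

df3 = rows_not_in(df1, df2)
print(df3)
```

Since the result is just a boolean selection on df1, repeated rows in df1 survive and the original index labels are kept.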

Method 2: merge with indicator

df1.merge(df2, indicator=True, how='left').loc[lambda x: x['_merge'] != 'both']
Out[421]: 
   A  B     _merge
1  2  3  left_only
2  3  4  left_only
3  3  4  left_only
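If the _merge helper column is not wanted in the result, one possible follow-up is to drop it after filtering. The sketch below wraps Method 2 in a hypothetical function (anti_join is an illustrative name, not a pandas API) and assumes the two frames share the columns that should be compared, since merge without an explicit `on` joins on all common columns.

```python
import pandas as pd


def anti_join(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `left` that have no match in `right`.

    Hypothetical helper; joins on all columns the two frames have in common.
    """
    # indicator=True adds a "_merge" column saying where each row came from:
    # "left_only", "right_only" or "both".
    merged = left.merge(right, how="left", indicator=True)
    only_left = merged.loc[merged["_merge"] == "left_only"]
    return only_left.drop(columns="_merge")


df1 = pd.DataFrame({"A": [1, 2, 3, 3], "B": [2, 3, 4, 4]})
df2 = pd.DataFrame({"A": [1], "B": [2]})

df3 = anti_join(df1, df2)
print(df3)
```

One thing to keep in mind: merging on columns gives the result a fresh integer index, so if df1 has meaningful index labels, Method 1 may be the better fit.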
