Pandas: Diff of two DataFrames
Question
I need to compare two DataFrames of different sizes row-wise and print out the non-matching rows. Take the following two:
from pandas import DataFrame

df1 = DataFrame({
    'Buyer': ['Carl', 'Carl', 'Carl'],
    'Quantity': [18, 3, 5]})
df2 = DataFrame({
    'Buyer': ['Carl', 'Mark', 'Carl', 'Carl'],
    'Quantity': [2, 1, 18, 5]})
What is the most efficient way to go row-wise over df2 and print out the rows that are not in df1? For example:
Buyer Quantity
Carl 2
Mark 1
Important: I do not want the row:
Buyer Quantity
Carl 3
to be included in the diff.
I have already tried "Comparing two dataframes of different length row by row and adding columns for each row with equal value" and "Outputting difference in two Pandas dataframes side by side - highlighting the difference".
But these do not match my problem.
Thank you,
Andy
Recommended answer
merge the two DataFrames with how='outer' and pass indicator=True. This adds a _merge column that tells you whether each row is present in both, in the left only, or in the right only. You can then filter the merged DataFrame on that column:
In [22]:
merged = df1.merge(df2, indicator=True, how='outer')
merged[merged['_merge'] == 'right_only']
Out[22]:
Buyer Quantity _merge
3 Carl 2 right_only
4 Mark 1 right_only
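As a self-contained sketch of the same idea, the snippet below wraps the merge and the filter in a small function and drops the helper column afterwards. The function name rows_only_in_right is hypothetical and used only for illustration. It assumes both DataFrames share the same column names, so that merge joins on all of them by default.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Buyer': ['Carl', 'Carl', 'Carl'],
    'Quantity': [18, 3, 5]})
df2 = pd.DataFrame({
    'Buyer': ['Carl', 'Mark', 'Carl', 'Carl'],
    'Quantity': [2, 1, 18, 5]})


def rows_only_in_right(left, right):
    """Return the rows of `right` with no exact match in `left`.

    Hypothetical helper: assumes `left` and `right` have the same columns,
    so the merge joins on every shared column.
    """
    # The outer merge keeps every row from both sides; indicator=True adds
    # a '_merge' column with values 'both', 'left_only' or 'right_only'.
    merged = left.merge(right, how='outer', indicator=True)
    # Keep only rows that came from `right` alone.
    only_right = merged[merged['_merge'] == 'right_only']
    # Drop the helper column and renumber the index from zero.
    return only_right.drop(columns='_merge').reset_index(drop=True)


diff = rows_only_in_right(df1, df2)
print(diff)
```

Here diff should hold the two rows (Carl, 2) and (Mark, 1). The row (Carl, 3) exists only in df1, so it is marked left_only and is filtered out, which matches the requirement in the question.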