Compare two dataframes and get nearest matching dataframe
Problem description
I have two dataframes with the following columns:
df1
name cell marks
tom 2 21862
df2
name cell marks passwd
tom 2 11111 2548
matt 2 158416 2483
2 21862 26846
How can I compare df2 with df1 and get the nearest matching rows?
expected_output:
df2
name cell marks passwd
tom 2 11111 2548
2 21862 26846
I tried merge, but the data is dynamic: in one case name might change, and in another case marks might change.
Recommended answer
You can use pandas.merge with the option indicator=True, then keep only the rows whose _merge value is 'both':
import pandas as pd

df1 = pd.DataFrame([['tom', 2, 11111]], columns=["name", "cell", "marks"])
df2 = pd.DataFrame([['tom', 2, 11111, 2548],
                    ['matt', 2, 158416, 2483]
                    ], columns=["name", "cell", "marks", "passwd"])


def compare_dataframes(df1, df2):
    """Find rows which are similar between two DataFrames."""
    # With no `on` argument, merge joins on every column the frames share.
    comparison_df = df1.merge(df2,
                              indicator=True,
                              how='outer')
    # '_merge' is 'both' only for rows present in df1 and df2.
    return comparison_df[comparison_df['_merge'] == 'both'].drop(columns=["_merge"])


print(compare_dataframes(df1, df2))
Returns:
name cell marks passwd
0 tom 2 11111 2548
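The merge above joins on every shared column, so it only finds rows where name, cell and marks all match exactly. The question asks for rows that still match when either name or marks differs. One possible way to approach that, offered as a minimal sketch rather than the answerer's method, is to run the same indicator-based merge once per subset of key columns and keep any df2 row that matches on at least one subset. The helper name nearest_matches, its key_sets parameter, and the choice of key subsets are hypothetical. The blank name in the third df2 row is assumed to be an empty string.

```python
import pandas as pd

# Data from the question; the blank name is assumed to be an empty string.
df1 = pd.DataFrame([["tom", 2, 21862]], columns=["name", "cell", "marks"])
df2 = pd.DataFrame(
    [["tom", 2, 11111, 2548],
     ["matt", 2, 158416, 2483],
     ["", 2, 21862, 26846]],
    columns=["name", "cell", "marks", "passwd"],
)


def nearest_matches(reference, candidates, key_sets):
    """Return rows of `candidates` matching `reference` on at least one key set.

    Hypothetical helper: `key_sets` is a list of column-name lists.
    """
    keep = pd.Series(False, index=candidates.index)
    for keys in key_sets:
        # De-duplicate so the left merge cannot multiply candidate rows.
        ref_keys = reference[keys].drop_duplicates()
        # A left merge preserves the order of the candidate rows;
        # '_merge' is 'both' where the key combination exists in `reference`.
        merged = candidates[keys].merge(
            ref_keys, on=keys, how="left", indicator=True
        )
        keep |= (merged["_merge"] == "both").to_numpy()
    return candidates[keep]


# Match on (name, cell) when marks changed, or on (cell, marks) when name changed.
result = nearest_matches(
    df1, df2, key_sets=[["name", "cell"], ["cell", "marks"]]
)
print(result)
```

With the question's data this keeps the tom row (matched on name and cell) and the unnamed row (matched on cell and marks), and drops the matt row. Which key subsets count as "nearest" depends on the data, so the list passed as key_sets would need adjusting for other cases.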