Compare two dataframes and get nearest matching dataframe


Problem description

I have two dataframes with the following columns:

df1

name    cell    marks
tom     2       21862

df2

name    cell    marks     passwd
tom     2       11111     2548
matt    2       158416    2483
        2       21862     26846

How can I compare df2 with df1 and get the nearest matching rows?

expected_output:

df2

name    cell    marks     passwd
tom     2       11111     2548
        2       21862     26846

I tried merge, but the data is dynamic: in one case the name might change, and in another case the marks might change.

Recommended answer

You can use pandas.merge with the option indicator=True and filter the result for 'both', which keeps the rows that match on all shared columns:

import pandas as pd

df1 = pd.DataFrame([['tom', 2, 11111]], columns=["name", "cell", "marks"])

df2 = pd.DataFrame([['tom', 2, 11111, 2548],
                    ['matt', 2, 158416, 2483]
                    ], columns=["name", "cell", "marks", "passwd"])


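def compare_dataframes(df1, df2):
    """Find rows of df2 that match a row of df1 on all shared columns."""
    # With no `on` argument, merge joins on every column name the frames share.
    # indicator=True adds a `_merge` column: 'left_only', 'right_only' or 'both'.
    comparison_df = df1.merge(df2,
                              indicator=True,
                              how='outer')
    # Keep only rows present in both frames, then drop the helper column.
    return comparison_df[comparison_df['_merge'] == 'both'].drop(columns=["_merge"])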


print(compare_dataframes(df1, df2))

This returns:

  name  cell  marks  passwd
0  tom     2  11111    2548
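
For the data in the question, where a row may agree with df1 on the name or on the marks but not both, one possible reading of "nearest" is: keep every row of df2 that matches df1 on at least one chosen subset of columns. The following is only a sketch under that assumption. The helper name partial_matches, the choice of key sets, and representing the blank name as None are illustrative and not part of the original answer.

```python
import pandas as pd

# Data from the question; the blank name in df2 is stored as None (an assumption).
df1 = pd.DataFrame([["tom", 2, 21862]], columns=["name", "cell", "marks"])
df2 = pd.DataFrame(
    [["tom", 2, 11111, 2548],
     ["matt", 2, 158416, 2483],
     [None, 2, 21862, 26846]],
    columns=["name", "cell", "marks", "passwd"],
)


def partial_matches(left, right, key_sets):
    """Return rows of `right` that match `left` on at least one key set."""
    keep = pd.Series(False, index=right.index)
    for keys in key_sets:
        # Unique key combinations, so the merge cannot duplicate rows of `right`.
        left_keys = left[keys].drop_duplicates()
        # A left merge keeps one output row per row of `right`, in the same order.
        merged = right[keys].merge(left_keys, on=keys, how="left", indicator=True)
        keep |= (merged["_merge"] == "both").to_numpy()
    return right[keep]


# Match on (name, cell) when marks differ, or on (cell, marks) when the name differs.
result = partial_matches(df1, df2, [["name", "cell"], ["cell", "marks"]])
print(result)
```

With these key sets, the rows kept are the tom row (same name and cell) and the row with the blank name (same cell and marks), which is the expected output in the question. If "nearest" should instead mean numerically closest marks, a different approach such as pandas.merge_asof would be worth considering.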
