Euclidean distance between two pandas dataframes


Question

I have two dataframes:

df1, of the form:

user_id  | x_coord  | y_coord
 214         -55.2      22.1
 214         -55.2      22.1
 214         -55.2      22.1
...

and df2, of the same form, but with different users:

user_id  | x_coord  | y_coord
 512         -15.2      19.1
 362          65.1      71.4
 989         -84.8      13.7
...

I want to find the Euclidean distance between the user in df1 and all the users in df2. To do this, I need to compute the Euclidean distance between the two dataframes, based on the last two columns (x_coord and y_coord), so that I can find which users in the second dataframe are closest to user 214.

I found this answer, but it is not what I need: my two dataframes have the same shape, and I need the distance computed row by row:

Euclidean_Distance_i(row_i_df1, row_i_df2)

and then save all these distances in a list that has the same length as the dataframes.

Recommended answer

Try:

import numpy as np

def Euclidean_Dist(df1, df2, cols=['x_coord','y_coord']):
    # Row-wise distance: row i of df1 is paired with row i of df2 by position
    return np.linalg.norm(df1[cols].values - df2[cols].values,
                          axis=1)
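
For reference, the per-row quantity this function returns can be written out explicitly. The superscripts (1) and (2) are notation introduced here for illustration only: they mark the values taken from row i of df1 and row i of df2 respectively.

```latex
d_i = \sqrt{\left(x_i^{(1)} - x_i^{(2)}\right)^2 + \left(y_i^{(1)} - y_i^{(2)}\right)^2}
```

For the first rows shown in the question, (-55.2, 22.1) and (-15.2, 19.1), this gives sqrt((-40)^2 + 3^2) = sqrt(1609), which is about 40.11.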

Test:

import pandas as pd

df1 = pd.DataFrame({'user_id':[214,214,214],
                    'x_coord':[-55.2,-55.2,-55.2],
                    'y_coord':[22.1,22.1,22.1]})

df2 = pd.DataFrame({'user_id':[512, 362, 989],
                    'x_coord':[-15.2, 65.1, -84.8],
                    'y_coord':[19.1, 71.4, 13.7]})

Euclidean_Dist(df1, df2)

Output:

array([ 40.11234224, 130.0099227 ,  30.76881538])
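
To get from these distances back to the original goal (which users in df2 are closest to user 214, with the distances saved as a list), one option is to attach the distances to a copy of df2 and sort by them. The following is only a minimal sketch, assuming the rows of df1 and df2 are already paired by position; the column name `dist` and the variables `ranked` and `distances` are illustrative choices and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'user_id': [214, 214, 214],
                    'x_coord': [-55.2, -55.2, -55.2],
                    'y_coord': [22.1, 22.1, 22.1]})

df2 = pd.DataFrame({'user_id': [512, 362, 989],
                    'x_coord': [-15.2, 65.1, -84.8],
                    'y_coord': [19.1, 71.4, 13.7]})

cols = ['x_coord', 'y_coord']

# Row-wise distances; .to_numpy() drops the index, so rows are paired by position
dist = np.linalg.norm(df1[cols].to_numpy() - df2[cols].to_numpy(), axis=1)

# 'dist' is a hypothetical column name used only for this illustration
ranked = df2.assign(dist=dist).sort_values('dist').reset_index(drop=True)
print(ranked[['user_id', 'dist']])

# Plain Python list with the same length as the dataframes, as asked in the question
distances = dist.tolist()
```

With the sample data, user 989 comes out closest (about 30.77), followed by 512 (about 40.11) and 362 (about 130.01). Because the subtraction works on raw arrays, the two dataframes need the same number of rows in the intended order; index labels are ignored.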
