How to apply a Euclidean distance function to a groupby object in a pandas DataFrame?


Question

I have a set of objects and their positions over time. I would like to get the average distance between objects at each time point. An example dataframe is as follows:

import pandas as pd

# One row per (time, car) observation; x and y are the car's position.
time = [0, 0, 0, 1, 1, 2, 2]
x = [216, 218, 217, 280, 290, 130, 132]
y = [13, 12, 12, 110, 109, 3, 56]
car = [1, 2, 3, 1, 3, 4, 5]
df = pd.DataFrame({'time': time, 'x': x, 'y': y, 'car': car})
df

        x    y  car
time
0     216   13    1
0     218   12    2
0     217   12    3
1     280  110    1
1     290  109    3
2     130    3    4
2     132   56    5

The final result I want is:

df2

      average distance between cars
time
0                              1.55
1                             10.05
2                             53.04

Any ideas on how to proceed? I've been trying to apply the scipy.spatial.distance functions to the dataframe, but I'm not sure how to apply them to df.groupby('time') and then take the mean of all those distances. Any help appreciated!

Recommended answer

You could pass an array of the points in each group to scipy.spatial.distance.pdist. It calculates the distance between every pair of points Xi and Xj, counting each pair once, and returns them as a one-dimensional array. Then take the mean of that array for each time point.

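from scipy import spatial

# For each time point, take the (x, y) columns as an n-by-2 array,
# compute all pairwise Euclidean distances, and average them.
df.groupby('time')[['x', 'y']].apply(
    lambda g: spatial.distance.pdist(g.to_numpy()).mean()
)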

Output:

time
0     1.550094
1    10.049876
2    53.037722
dtype: float64
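
The same idea can be written as a self-contained script. This is only a sketch: the helper name mean_pairwise_distance and the choice to return NaN for a time point with fewer than two cars are assumptions made here, not part of the original answer. Without that guard, pdist returns an empty array for a single point and its mean is NaN accompanied by a runtime warning.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist


def mean_pairwise_distance(group):
    """Mean Euclidean distance over all unordered pairs of (x, y) rows.

    Hypothetical helper: returns NaN when the group has fewer than two
    points, because no pair exists to measure.
    """
    points = group[["x", "y"]].to_numpy(dtype=float)
    if len(points) < 2:
        return np.nan
    return pdist(points).mean()


df = pd.DataFrame({
    "time": [0, 0, 0, 1, 1, 2, 2],
    "x": [216, 218, 217, 280, 290, 130, 132],
    "y": [13, 12, 12, 110, 109, 3, 56],
    "car": [1, 2, 3, 1, 3, 4, 5],
})

# Select only the coordinate columns, then reduce each time group to one number.
result = (
    df.groupby("time")[["x", "y"]]
    .apply(mean_pairwise_distance)
    .rename("average distance between cars")
)
print(result.round(2))
```

As a hand check for time 0, the three pairwise distances are sqrt(5), sqrt(2), and 1, whose mean is about 1.55, matching the first row of the desired result.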
