Haversine Distance Calc using Pandas Data Frame: "cannot convert the series to <class 'float'>"

Problem description

I'm trying to use the Haversine calculation on a Pandas DataFrame.

from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))

    # radius of the Earth in miles (use 6371 for kilometers)
    r = 3956

    return c * r

This works when using the following code:

haversine(-73.9881286621093,40.7320289611816,-73.9901733398437,40.7566795349121)

However, when I use it against a Pandas DataFrame like this:

train_data['Distance_Travelled'] = train_data.apply(lambda row: haversine(train_data['pickup_longitude'], train_data['pickup_latitude'], train_data['dropoff_longitude'], train_data['dropoff_latitude']), axis=1)

I get the following error:

"cannot convert the series to <class 'float'>"

I've tried numerous ways of casting, but each attempt results in the same error. I know that math is expecting a float, but I don't understand why the Pandas Series can't be cast as a float.

What edit needs to be made for it to work, and why?

Recommended answer

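The error occurs because the lambda passes entire columns (train_data['pickup_longitude'] and so on) rather than the values of the current row, and the functions in math accept only a single float, not a Series. Don't use apply here, since it is not vectorized. Instead, use the vectorized functions from numpy, which operate on whole columns at once: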

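import numpy as np

def haversine(lon1, lat1, lon2, lat2):
    # convert decimal degrees to radians (works on whole columns)
    lon1, lat1, lon2, lat2 = np.deg2rad([lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2
    # np.arcsin is available in every NumPy version; np.asin only exists in NumPy 2.0+
    c = 2 * np.arcsin(np.sqrt(a))

    # radius of the Earth in miles
    r = 3956

    return c * r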

train_data['Distance_Travelled'] = haversine(train_data['pickup_longitude'], 
                                             train_data['pickup_latitude'], 
                                             train_data['dropoff_longitude'], 
                                             train_data['dropoff_latitude']
                                            )
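
As for the why: in the original call, the lambda ignores its row argument and hands whole columns to the math-based function, and math.radians can only convert a single value to float. The sketch below reproduces the error, shows a row-wise apply that reads from row instead, and checks that it agrees with the vectorized version. The small DataFrame is made up for illustration, and the names haversine_scalar and haversine_np are hypothetical; treat this as an illustration under those assumptions rather than a definitive implementation.

```python
import math

import numpy as np
import pandas as pd


def haversine_scalar(lon1, lat1, lon2, lat2, r=3956):
    """Distance in miles for ONE pair of points given as plain floats (degrees)."""
    lon1, lat1, lon2, lat2 = map(math.radians, [lon1, lat1, lon2, lat2])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(a)) * r


def haversine_np(lon1, lat1, lon2, lat2, r=3956):
    """Same formula with NumPy functions, so it accepts whole columns."""
    lon1, lat1, lon2, lat2 = np.deg2rad([lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * np.arcsin(np.sqrt(a)) * r


# Made-up sample data: the first row reuses the coordinates from the question,
# the second row has identical pickup and dropoff points.
train_data = pd.DataFrame({
    "pickup_longitude": [-73.9881286621093, -73.98],
    "pickup_latitude": [40.7320289611816, 40.75],
    "dropoff_longitude": [-73.9901733398437, -73.98],
    "dropoff_latitude": [40.7566795349121, 40.75],
})

# 1) Passing whole columns to the math-based function raises TypeError,
#    because math.radians needs a single float, not a multi-row Series.
try:
    haversine_scalar(train_data["pickup_longitude"], train_data["pickup_latitude"],
                     train_data["dropoff_longitude"], train_data["dropoff_latitude"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# 2) Row-wise fix: read the scalar values from `row`, not from `train_data`.
row_wise = train_data.apply(
    lambda row: haversine_scalar(row["pickup_longitude"], row["pickup_latitude"],
                                 row["dropoff_longitude"], row["dropoff_latitude"]),
    axis=1,
)

# 3) Vectorized call: no apply, the whole columns are processed at once.
vectorized = haversine_np(train_data["pickup_longitude"], train_data["pickup_latitude"],
                          train_data["dropoff_longitude"], train_data["dropoff_latitude"])

print(error_message)
print(np.allclose(row_wise, vectorized))
```

Both approaches give the same distances. The row-wise apply calls the Python function once per row, so on large frames the vectorized call is generally the faster choice.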
