From pandas dataframe to tuples (for haversine module)
Question
I have a pandas dataframe my_df with the following columns:
id lat1 lon1 lat2 lon2
1 45 0 41 3
2 40 1 42 4
3 42 2 37 1
Basically, I'd like to do the following:
import haversine
haversine.haversine((45, 0), (41, 3)) # just to show syntax of haversine()
> 507.20410687342115
# what I'd like to do
my_df["dist"] = haversine.haversine((my_df["lat1"], my_df["lon1"]),(my_df["lat2"], my_df["lon2"]))
TypeError: cannot convert the series to <class 'float'>
Using this, I tried the following:
my_df['dist'] = haversine.haversine(
list(zip(*[my_df[['lat1','lon1']][c].values.tolist() for c in my_df[['lat1','lon1']]]))
,
list(zip(*[my_df[['lat2','lon2']][c].values.tolist() for c in my_df[['lat2','lon2']]]))
)
File "blabla\lib\site-packages\haversine\__init__.py", line 20, in haversine
    lat1, lng1 = point1
ValueError: too many values to unpack (expected 2)
Any idea what I'm doing wrong, or how I can achieve what I want?
Recommended answer
Use apply with axis=1:
my_df["dist"] = my_df.apply(lambda row : haversine.haversine((row["lat1"], row["lon1"]),(row["lat2"], row["lon2"])), axis=1)
This calls the haversine function once per row. The function understands scalar values, not array-like values, hence the error. By calling apply with axis=1, you iterate row-wise, so each column value can be accessed and passed in the form that the function expects.
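To make the row-wise call concrete, here is a minimal runnable sketch using the sample data from the question. So that it runs without the haversine package installed, it uses haversine_km, a hypothetical stand-in that applies the standard haversine formula with an assumed mean Earth radius of 6371 km; it is not the package's own code, and in practice haversine.haversine would be called in its place.

```python
import math

import pandas as pd


def haversine_km(point1, point2, radius_km=6371.0):
    """Hypothetical stand-in for haversine.haversine (assumed radius: 6371 km).

    Takes two (lat, lon) pairs of scalars in degrees and returns kilometres.
    """
    lat1, lon1 = map(math.radians, point1)
    lat2, lon2 = map(math.radians, point2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Sample data copied from the question
my_df = pd.DataFrame({
    "id": [1, 2, 3],
    "lat1": [45, 40, 42], "lon1": [0, 1, 2],
    "lat2": [41, 42, 37], "lon2": [3, 4, 1],
})

# axis=1 hands the lambda one row at a time, so row["lat1"] etc. are scalars
my_df["dist"] = my_df.apply(
    lambda row: haversine_km((row["lat1"], row["lon1"]),
                             (row["lat2"], row["lon2"])),
    axis=1,
)
print(my_df)
```

The first row should come out close to the 507.2 km shown in the question, since it uses the same pair of points.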
Also, I don't know how it differs, but there is a vectorised version of the haversine formula.
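The vectorised version referred to above is not reproduced here. As an illustration of the idea only, the following sketch applies the standard haversine formula to whole arrays with NumPy instead of one row at a time. The function name haversine_np and the 6371 km mean Earth radius are assumptions made for this example, so results may differ slightly from the haversine package.

```python
import numpy as np


def haversine_np(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Hypothetical vectorised haversine; inputs are array-likes in degrees."""
    # Convert every input to a float array in radians
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(x, dtype=float)) for x in (lat1, lon1, lat2, lon2)
    )
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # Element-wise great-circle distance in kilometres
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Columns from the question, written out as plain lists
dist = haversine_np([45, 40, 42], [0, 1, 2], [41, 42, 37], [3, 4, 1])
print(dist)
```

With the dataframe from the question, the same function could be called as my_df["dist"] = haversine_np(my_df["lat1"], my_df["lon1"], my_df["lat2"], my_df["lon2"]), which avoids the Python-level loop that apply performs.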