Finding nearest station to each shop using BallTree


Question

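I have two datasets: a list of shops with UK coordinates, and a list of train stations with coordinates.

I am using BallTree to find the nearest station to each shop, using the code from this website, with my own DataFrames swapped in as appropriate.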

https://automating-gis-processes.github.io/site/notebooks/L3/nearest-neighbor-faster.html

Code:

import pandas as pd
import numpy as np
import geopandas as gpd

from sklearn.neighbors import BallTree

df_pocs = pd.read_csv(r'C:\Users\FLETCHWI\Desktop\XX\shops.csv', encoding = "ISO-8859-1", engine='python')

df_stations = pd.read_csv(r'C:\Users\FLETCHWI\Desktop\xx\uk_stations.csv', encoding = "ISO-8859-1", engine='python')


gdf_pocs = gpd.GeoDataFrame(
    df_pocs, geometry=gpd.points_from_xy(df_pocs.longitude, df_pocs.latitude))

gdf_stations = gpd.GeoDataFrame(
    df_stations, geometry=gpd.points_from_xy(df_stations.longitude, df_stations.latitude))



def get_nearest(src_points, candidates, k_neighbors=1):
    """Find nearest neighbors for all source points from a set of candidate points"""

    # Create tree from the candidate points
    tree = BallTree(candidates, leaf_size=15, metric='haversine')

    # Find closest points and distances
    distances, indices = tree.query(src_points, k=k_neighbors)

    # Transpose to get distances and indices into arrays
    distances = distances.transpose()
    indices = indices.transpose()

    # Get closest indices and distances (i.e. array at index 0)
    # note: for the second closest points, you would take index 1, etc.
    closest = indices[0]
    closest_dist = distances[0]

    # Return indices and distances
    return (closest, closest_dist)


def nearest_neighbor(left_gdf, right_gdf, return_dist=False):
    """
    For each point in left_gdf, find closest point in right GeoDataFrame and return them.

    NOTICE: Assumes that the input Points are in WGS84 projection (lat/lon).
    """

    left_geom_col = left_gdf.geometry.name
    right_geom_col = right_gdf.geometry.name

    # Ensure that index in right gdf is formed of sequential numbers
    right = right_gdf.copy().reset_index(drop=True)

    # Parse coordinates from points and insert them into a numpy array as RADIANS
    left_radians = np.array(left_gdf[left_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())
    right_radians = np.array(right[right_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())

    # Find the nearest points
    # -----------------------
    # closest ==> index in right_gdf that corresponds to the closest point
    # dist ==> distance between the nearest neighbors (in radians; converted to meters below)

    closest, dist = get_nearest(src_points=left_radians, candidates=right_radians)

    # Return points from right GeoDataFrame that are closest to points in left GeoDataFrame
    closest_points = right.loc[closest]

    # Ensure that the index corresponds the one in left_gdf
    closest_points = closest_points.reset_index(drop=True)

    # Add distance if requested
    if return_dist:
        # Convert to meters from radians
        earth_radius = 6371000  # meters
        closest_points['distance'] = dist * earth_radius

    return closest_points

# Find closest public transport stop for each building and get also the distance based on haversine distance
# Note: haversine distance which is implemented here is a bit slower than using e.g. 'euclidean' metric
# but useful as we get the distance between points in meters
closest_stations = nearest_neighbor(gdf_pocs, gdf_stations, return_dist=True)

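After running the code, it returns the same station for every shop I have. However, I want it to find the nearest station, and the distance to it, for each shop.

Any help is appreciated, thanks!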

Recommended answer

I ran some tests on the function, and you do indeed need to reverse the latitude/longitude order for it to work.

Note the warning:

NOTICE: Assumes that the input Points are in WGS84 projection (lat/lon).
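To illustrate why the coordinate order matters, here is a minimal sketch using only the standard library. The `haversine_km` helper and the two city coordinates are illustrative assumptions, not part of the original code; the point is simply that the haversine formula gives a different distance when latitude and longitude are fed in the wrong order.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (the question uses 6371000 m)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees.

    Hypothetical helper for illustration only.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates as (latitude, longitude) in degrees
london = (51.5074, -0.1278)
edinburgh = (55.9533, -3.1883)

# Correct order: latitude first, then longitude
correct_km = haversine_km(london[0], london[1], edinburgh[0], edinburgh[1])

# Wrong order: longitude passed where latitude is expected
swapped_km = haversine_km(london[1], london[0], edinburgh[1], edinburgh[0])

print(f"lat/lon order: {correct_km:.0f} km")
print(f"lon/lat order: {swapped_km:.0f} km")
```

The two printed distances differ noticeably, so a nearest-neighbour search built on the swapped order can rank candidates incorrectly.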

So, when defining the points, simply change

gdf_pocs = gpd.GeoDataFrame(
    df_pocs, geometry=gpd.points_from_xy(df_pocs.longitude, df_pocs.latitude))

gdf_stations = gpd.GeoDataFrame(
    df_stations, geometry=gpd.points_from_xy(df_stations.longitude, df_stations.latitude))

to

gdf_pocs = gpd.GeoDataFrame(
    df_pocs, geometry=gpd.points_from_xy(df_pocs.latitude, df_pocs.longitude))

gdf_stations = gpd.GeoDataFrame(
    df_stations, geometry=gpd.points_from_xy(df_stations.latitude, df_stations.longitude))
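As a hedged cross-check, the sketch below calls BallTree directly with coordinates arranged as [latitude, longitude] in radians, which is the order scikit-learn's haversine metric expects. The station and shop coordinates are made-up illustrative values, and names such as `stations_deg` and `shops_deg` are assumptions rather than anything from the original data.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6371000  # same radius the question's code uses

# Hypothetical sample data as [latitude, longitude] in degrees
stations_deg = np.array([
    [51.5308, -0.1238],   # a station in London
    [53.4774, -2.2309],   # a station in Manchester
    [55.9520, -3.1900],   # a station in Edinburgh
])
shops_deg = np.array([
    [51.5074, -0.1278],   # a shop in London
    [53.4808, -2.2426],   # a shop in Manchester
    [55.9533, -3.1883],   # a shop in Edinburgh
])

# Build the tree on the candidate stations, converted to radians
tree = BallTree(np.radians(stations_deg), metric="haversine")

# Query the single nearest station for every shop; results have shape (n_shops, 1)
dist_rad, idx = tree.query(np.radians(shops_deg), k=1)

nearest_idx = idx[:, 0]                      # row of the nearest station per shop
nearest_m = dist_rad[:, 0] * EARTH_RADIUS_M  # radians on the unit sphere to meters

for shop, (i, d) in enumerate(zip(nearest_idx, nearest_m)):
    print(f"shop {shop}: nearest station {i}, about {d:.0f} m away")
```

With this sample data each shop is matched to the station in its own city. An alternative to swapping the arguments of `points_from_xy` might be to keep the geometry as (longitude, latitude), which is the usual GeoPandas convention, and instead build the radian tuples in `nearest_neighbor` as `(geom.y, geom.x)`; that is an assumption about how the function could be adapted, not something the answer tested.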
