Pandas - Group/bins of data per longitude/latitude

Problem description

I have a bunch of geographical data, shown below. I would like to group the data into bins of 0.2 degrees in longitude AND 0.2 degrees in latitude.

While this is trivial to do for either latitude or longitude alone, what is the most appropriate way of doing it for both variables?

|User_ID  |Latitude  |Longitude|Datetime           |u    |v    |
|---------|----------|---------|-------------------|-----|-----|
|222583401|41.4020375|2.1478710|2014-07-06 20:49:20|0.3  | 0.2 |
|287280509|41.3671346|2.0793115|2013-01-30 09:25:47|0.2  | 0.7 |
|329757763|41.5453577|2.1175164|2012-09-25 08:40:59|0.5  | 0.8 |
|189757330|41.5844998|2.5621569|2013-10-01 11:55:20|0.4  | 0.4 |
|624921653|41.5931846|2.3030671|2013-07-09 20:12:20|1.2  | 1.4 |
|414673119|41.5550136|2.0965829|2014-02-24 20:15:30|2.3  | 0.6 |
|414673119|41.5550136|2.0975829|2014-02-24 20:16:30|4.3  | 0.7 |
|414673119|41.5550136|2.0985829|2014-02-24 20:17:30|0.6  | 0.9 |

So far, I have created two linear spaces:

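import numpy as np
import pandas as pd

# 10 evenly spaced edges give 9 equal-width bins over the data range (not 0.2-degree bins)
lonbins = np.linspace(df.Longitude.min(), df.Longitude.max(), 10)
latbins = np.linspace(df.Latitude.min(), df.Latitude.max(), 10)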

Then I can group by longitude using:

groups = df.groupby(pd.cut(df.Longitude, lonbins))

I could then obviously iterate over the groups to create a second level. However, my goal is to do statistical analysis on each group and possibly display the groups on a map, and this approach does not look very handy.

bucket = {}
for name, group in groups:
    print(name)
    # second level: split each longitude group by latitude bin
    bucket[name] = group.groupby(pd.cut(group.Latitude, latbins))
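
For reference, the nested loop can probably be avoided, because `groupby` accepts a list of keys and both `pd.cut` results can be passed at once. The following is only a sketch under some assumptions: the `edges` helper is hypothetical (it is not part of pandas), the bin edges are placed on multiples of 0.2 degrees with `np.arange` instead of `np.linspace`, and the data frame holds just the latitude and longitude columns of the sample rows.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table in the question (coordinates only).
df = pd.DataFrame({
    "Latitude": [41.4020375, 41.3671346, 41.5453577, 41.5844998,
                 41.5931846, 41.5550136, 41.5550136, 41.5550136],
    "Longitude": [2.1478710, 2.0793115, 2.1175164, 2.5621569,
                  2.3030671, 2.0965829, 2.0975829, 2.0985829],
})

step = 0.2  # bin width in degrees


def edges(values, step):
    """Hypothetical helper: edges on multiples of `step` covering `values`."""
    lo = np.floor(values.min() / step)
    hi = np.floor(values.max() / step) + 1
    return np.arange(lo, hi + 1) * step


latbins = edges(df.Latitude, step)
lonbins = edges(df.Longitude, step)

# Group on both binned variables in a single call.
# right=False makes the intervals closed on the left: [edge, next_edge).
# observed=True keeps only the bin combinations that contain rows.
groups = df.groupby(
    [pd.cut(df.Latitude, latbins, right=False),
     pd.cut(df.Longitude, lonbins, right=False)],
    observed=True,
)

counts = groups.size()  # number of rows per (latitude bin, longitude bin)
print(counts)
```

The result is indexed by a pair of intervals, so each group can be looked up or iterated without a second level of dictionaries.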

For example, I would like to draw a heatmap that displays the number of rows per lat/lon box, show the distribution of speed in each lat/lon box, and so on.

Recommended answer

How about this?

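import numpy as np

step = 0.2  # bin width in degrees
# snap each coordinate down to the lower edge of its bin
to_bin = lambda x: np.floor(x / step) * step
df["latbin"] = df.Latitude.map(to_bin)
df["lonbin"] = df.Longitude.map(to_bin)
# pass the keys as a list; recent pandas versions treat a tuple as a single key
groups = df.groupby(["latbin", "lonbin"])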
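
As a rough illustration of how the binned columns might feed the statistics and heatmap mentioned in the question, the sketch below counts rows and averages a speed value per box, then reshapes the counts into a latitude-by-longitude grid. Several things here are assumptions rather than part of the answer: `u` and `v` are treated as velocity components so that speed is `hypot(u, v)`, the bin labels are rounded to six decimals only to tidy floating-point noise, and the names `speed`, `stats` and `grid` are made up for the example.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "Latitude": [41.4020375, 41.3671346, 41.5453577, 41.5844998,
                 41.5931846, 41.5550136, 41.5550136, 41.5550136],
    "Longitude": [2.1478710, 2.0793115, 2.1175164, 2.5621569,
                  2.3030671, 2.0965829, 2.0975829, 2.0985829],
    "u": [0.3, 0.2, 0.5, 0.4, 1.2, 2.3, 4.3, 0.6],
    "v": [0.2, 0.7, 0.8, 0.4, 1.4, 0.6, 0.7, 0.9],
})

step = 0.2  # bin width in degrees

# Lower edge of each row's bin; rounding only cleans up the labels.
df["latbin"] = (np.floor(df.Latitude / step) * step).round(6)
df["lonbin"] = (np.floor(df.Longitude / step) * step).round(6)

# Assumption: u and v are velocity components, so speed is their magnitude.
df["speed"] = np.hypot(df.u, df.v)

# One row of statistics per occupied lat/lon box.
stats = df.groupby(["latbin", "lonbin"]).agg(
    n_rows=("speed", "size"),
    mean_speed=("speed", "mean"),
)

# Rows are latitude bins, columns are longitude bins; empty boxes become 0.
grid = stats["n_rows"].unstack("lonbin", fill_value=0)
print(stats)
print(grid)
```

A 2-D table like `grid` is the shape that plotting functions such as matplotlib's `imshow` or `pcolormesh` expect, so it could serve as the input for the heatmap.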
