Find an easier way to cluster 2-D scatter data into grid array data

Views: 343 Published: 2016/5/31 20:45:30 Tags: python arrays numpy matplotlib matplotlib-basemap


Problem description

I have figured out a method to cluster dispersed point data into a structured 2-D array (like a rasterize function), and I hope there is a better way to achieve this.

My work

1. Intro

Point data: 1000 points, each with three properties (lon, lat, emission), representing a factory located at (x, y) that emits a certain amount of CO2 into the atmosphere.
Grid network: a predefined 2-D array of shape 20x20.

The code is reproduced here:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

#### define the map area
xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105
map = Basemap(llcrnrlon=xc1, llcrnrlat=yc1, urcrnrlon=xc2, urcrnrlat=yc2)

#### read the point data and scatter-plot the points by position
df = pd.read_csv("xxxxx.csv")
px, py = map(df.lon.values, df.lat.values)
map.scatter(px, py, color="red", s=5, zorder=3)

#### predefine the grid network: 21 edges per axis bound 20x20 cells
lon_grid, lat_grid = np.linspace(xc1, xc2, 21), np.linspace(yc1, yc2, 21)
lon_x, lat_y = np.meshgrid(lon_grid, lat_grid)
grids = np.zeros((20, 20))
plt.pcolormesh(lon_x, lat_y, grids, cmap='gray', facecolor='none', edgecolor='k', zorder=3)

2. My target

Find the nearest grid point for each factory.

Add the factory's emission to that grid point.

3. Algorithm realization

3.1 Raster grid

Note: the 20x20 grid points distributed over this area are represented by blue dots.

3.2 KD-tree

Find the nearest blue dot for each red point:

from scipy.spatial import KDTree

# xx and yy are not defined in the original snippet; they are assumed here
# to be the 20 cell centres along each axis.
xx = 0.5 * (lon_grid[:-1] + lon_grid[1:])
yy = 0.5 * (lat_grid[:-1] + lat_grid[1:])

grids = np.zeros((20*20, 2))    # (lon, lat) of every grid point
grids_em = np.zeros(20*20)      # accumulated emission per grid point

k = 0
for j in range(yy.shape[0]):
    for i in range(xx.shape[0]):
        grids[k] = [xx[i], yy[j]]
        k += 1

T = KDTree(grids)

x_delta = lon_grid[2] - lon_grid[1]
y_delta = lat_grid[2] - lat_grid[1]
R = np.sqrt(x_delta**2 + y_delta**2)    # search radius: one cell diagonal

for i in range(len(df.lon)):
    idx = T.query_ball_point([df.lon.iloc[i], df.lat.iloc[i]], r=R)
    if len(idx) == 0:
        continue    # no grid point within R, e.g. a factory outside the grid
    # Sometimes more than one blue dot is found, so compute the distance
    # from the factory (red point) to every listed blue dot and keep the closest.
    if len(idx) > 1:
        distance = []
        for k in range(len(idx)):
            distance.append(np.sqrt((df.lon.iloc[i] - grids[idx[k]][0])**2 + (df.lat.iloc[i] - grids[idx[k]][1])**2))
        pos_index = distance.index(min(distance))
        pos = idx[pos_index]
    # Only one point was found
    else:
        pos = idx[0]
    grids_em[pos] += df.so2.iloc[i]

4. Result

co2 = grids_em.reshape(20,20)
plt.pcolormesh(lon_x,lat_y,co2,cmap =plt.cm.Spectral_r,zorder=3)

5. My question

Can someone point out any drawbacks or errors in this method?
Are there algorithms better suited to my target?

Thanks a lot!

Solution

There are many for-loops in your code, which is not the NumPy way.
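As a small illustration of that point, the nested loop that fills the grid coordinates in the question can be replaced by array operations. This is only a sketch: it assumes the grid points are the 20 cell centres per axis (the question leaves `xx` and `yy` undefined), and the names `lon_edges`, `lon_centres`, `lat_edges`, `lat_centres` and `grid_points` are illustrative, not from the original.

```python
import numpy as np

# Bounding box copied from the question.
xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105

# 21 edges per axis bound 20 cells; using the cell centres as grid points
# is an assumption, not something the question states.
lon_edges = np.linspace(xc1, xc2, 21)
lat_edges = np.linspace(yc1, yc2, 21)
lon_centres = 0.5 * (lon_edges[:-1] + lon_edges[1:])
lat_centres = 0.5 * (lat_edges[:-1] + lat_edges[1:])

# meshgrid + column_stack replaces the nested loop: row k holds the
# (lon, lat) of grid point k, with longitude varying fastest.
lon_c, lat_c = np.meshgrid(lon_centres, lat_centres)
grid_points = np.column_stack([lon_c.ravel(), lat_c.ravel()])
```

The row order matches the question's loop (outer loop over latitude, inner loop over longitude), so an index into `grid_points` can still be reshaped to `(20, 20)` afterwards.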

Make some sample data first:

import numpy as np
import pandas as pd
from scipy.spatial import KDTree
import pylab as pl

xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105       

N = 1000
GSIZE = 20
x, y = np.random.multivariate_normal([(xc1 + xc2)*0.5, (yc1 + yc2)*0.5], [[0.1, 0.02], [0.02, 0.1]], size=N).T
value = np.ones(N)

df_points = pd.DataFrame({"x":x, "y":y, "v":value})

For equally spaced grids you can use hist2d():

pl.hist2d(df_points.x, df_points.y, weights=df_points.v, bins=20, cmap="viridis");

Here is the output:
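If the gridded array itself is needed rather than a plot, `np.histogram2d` returns it directly. The sketch below is an illustration under assumptions: the data are synthetic stand-ins for the factory table, and the bin edges are the 21-point `linspace` edges from the question so that the result lines up with the asker's `pcolormesh` grid.

```python
import numpy as np

rng = np.random.default_rng(0)
xc1, xc2, yc1, yc2 = 113.49805889531724, 115.5030664238035, 37.39995194888143, 38.789235929357105

# Synthetic stand-in for the factory table (hypothetical values).
n = 1000
lon = rng.uniform(xc1, xc2, n)
lat = rng.uniform(yc1, yc2, n)
emission = np.ones(n)

# Same cell edges as in the question: 21 edges bound 20 cells per axis.
lon_edges = np.linspace(xc1, xc2, 21)
lat_edges = np.linspace(yc1, yc2, 21)

# H[i, j] is the sum of the weights in longitude bin i and latitude bin j.
H, _, _ = np.histogram2d(lon, lat, bins=[lon_edges, lat_edges], weights=emission)

# pcolormesh(lon_x, lat_y, C) expects C indexed as [lat, lon], hence the transpose.
co2 = H.T
```

Points that fall outside the outermost edges are simply not counted, so the total of `co2` can be compared with the total emission as a sanity check.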

Here is the code that uses KDTree:

X, Y = np.mgrid[x.min():x.max():GSIZE*1j, y.min():y.max():GSIZE*1j]

grid = np.c_[X.ravel(), Y.ravel()]
points = np.c_[df_points.x, df_points.y]

tree = KDTree(grid)
dist, indices = tree.query(points)

grid_values = df_points.groupby(indices).v.sum()

df_grid = pd.DataFrame(grid, columns=["x", "y"])
df_grid["v"] = grid_values

fig, ax = pl.subplots(figsize=(10, 8))
ax.plot(df_points.x, df_points.y, "kx", alpha=0.2)
mapper = ax.scatter(df_grid.x, df_grid.y, c=df_grid.v, 
                    cmap="viridis", 
                    linewidths=0, 
                    s=100, marker="o")
pl.colorbar(mapper, ax=ax);

The output is:
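In the KDTree code above, grid points that receive no data appear to be left as NaN, because the grouped sums are aligned to `df_grid` by index. When a dense 20x20 array with zeros for empty grid points is preferred, one possible alternative is `np.bincount` on the nearest-neighbour indices. This is a sketch with synthetic data; `grid_sum` and `grid_2d` are illustrative names.

```python
import numpy as np
from scipy.spatial import KDTree

rng = np.random.default_rng(0)
GSIZE = 20

# Synthetic points (hypothetical values), one unit of emission each.
x = rng.normal(114.5, 0.3, 1000)
y = rng.normal(38.1, 0.3, 1000)
value = np.ones(1000)

# Regular grid of GSIZE x GSIZE points, flattened to shape (GSIZE*GSIZE, 2).
X, Y = np.mgrid[x.min():x.max():GSIZE*1j, y.min():y.max():GSIZE*1j]
grid = np.c_[X.ravel(), Y.ravel()]

# Index of the nearest grid point for every data point.
_, indices = KDTree(grid).query(np.c_[x, y])

# bincount sums the weights per grid index; minlength keeps empty grid points as 0.
grid_sum = np.bincount(indices, weights=value, minlength=len(grid))
grid_2d = grid_sum.reshape(GSIZE, GSIZE)    # grid_2d[i, j] belongs to (X[i, j], Y[i, j])
```

Because `grid` was built with `ravel()` in row-major order, reshaping `grid_sum` back to `(GSIZE, GSIZE)` keeps each value at the same position as its coordinates in `X` and `Y`.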
