Find points in cells through pandas dataframes of coordinates

Problem description

I have to find which points fall inside a grid of square cells, given the points' coordinates and the coordinates of the cell bounds, using two pandas dataframes. I call dfc the dataframe containing the code and the boundary coordinates of each cell (I have simplified the problem; in the real analysis I have a big grid with geographical points and tons of points to check):

Code,minx,miny,maxx,maxy
01,0.0,0.0,2.0,2.0
02,2.0,2.0,3.0,3.0

and dfp the dataframe containing an Id and the coordinates of the points:

Id,x,y
0,1.5,1.5
1,1.1,1.1
2,2.2,2.2
3,1.3,1.3
4,3.4,1.4
5,2.0,1.5
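
For anyone who wants to reproduce the example, the two frames can be built from the CSV text above. This is only a minimal sketch: the variable names cells_csv and points_csv are made up for illustration, and reading Code as a string is an assumption, made so that the leading zeros in 01 and 02 are kept (if Code is parsed as an integer it presumably shows up as 1 and 2 instead).

```python
import io

import pandas as pd

# Hypothetical in-memory copies of the two CSV tables from the question.
cells_csv = """Code,minx,miny,maxx,maxy
01,0.0,0.0,2.0,2.0
02,2.0,2.0,3.0,3.0
"""

points_csv = """Id,x,y
0,1.5,1.5
1,1.1,1.1
2,2.2,2.2
3,1.3,1.3
4,3.4,1.4
5,2.0,1.5
"""

# Assumption: Code is read as text so that "01" is not turned into the integer 1.
dfc = pd.read_csv(io.StringIO(cells_csv), dtype={"Code": str})
dfp = pd.read_csv(io.StringIO(points_csv))

print(dfc)
print(dfp)
```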

Now I would like to perform a search that adds to the dfp dataframe a new column (called 'GridCode') holding the code of the grid cell that contains each point. The cells should be perfectly square, so I would like to perform the analysis with something like:

a = np.where(
            (dfp['x'] > dfc['minx']) &
            (dfp['x'] < dfc['maxx']) &
            (dfp['y'] > dfc['miny']) &
            (dfp['y'] < dfc['maxy']),
            dfc['Code'],
            'na')

avoiding several loops over the dataframes. The lengths of the dataframes are not the same. The resulting dataframe should look as follows:

   Id    x    y GridCode
0   0  1.5  1.5   01
1   1  1.1  1.1   01
2   2  2.2  2.2   02
3   3  1.3  1.3   01
4   4  3.4  1.4   na
5   5  2.0  1.5   na

Thanks in advance for your help!

Solution

There is probably a better way, but since this has been sitting out there for a while...

Use pandas boolean indexing to filter the dfc dataframe instead of np.where():

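def findGrid(point):
    # `point` is a single row of dfp (apply passes one row at a time with axis=1).
    # Keep the codes of the cells in dfc whose bounds strictly contain the point.
    c = dfc[(point['x'] > dfc['minx']) &
            (point['x'] < dfc['maxx']) &
            (point['y'] > dfc['miny']) &
            (point['y'] < dfc['maxy'])].Code

    if len(c) == 0:
        # No cell contains the point, so the result is missing.
        return None
    else:
        # Return the code of the first matching cell.
        return c.iat[0]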

Then use the pandas apply() function to call it once per row:

dfp['GridCode'] = dfp.apply(findGrid, axis=1)

This yields:

   Id    x    y GridCode
0   0  1.5  1.5        1
1   1  1.1  1.1        1
2   2  2.2  2.2        2
3   3  1.3  1.3        1
4   4  3.4  1.4      NaN
5   5  2.0  1.5      NaN
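
Since apply() with axis=1 still runs a Python-level loop over the points, the same test can also be done in one shot with NumPy broadcasting. The sketch below is an alternative to the answer above, not part of it. It assumes Code is stored as a string, keeps the strict inequalities (so points lying exactly on a cell edge match nothing), and picks the first matching cell if several match. The names inside, first_hit and has_hit are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (Code kept as text, which is an assumption).
dfc = pd.DataFrame({
    "Code": ["01", "02"],
    "minx": [0.0, 2.0], "miny": [0.0, 2.0],
    "maxx": [2.0, 3.0], "maxy": [2.0, 3.0],
})
dfp = pd.DataFrame({
    "Id": range(6),
    "x": [1.5, 1.1, 2.2, 1.3, 3.4, 2.0],
    "y": [1.5, 1.1, 2.2, 1.3, 1.4, 1.5],
})

# Point coordinates as column vectors of shape (n_points, 1) so that they
# broadcast against the cell bounds of shape (n_cells,).
x = dfp["x"].to_numpy()[:, None]
y = dfp["y"].to_numpy()[:, None]

# inside[i, j] is True when point i lies strictly inside cell j.
inside = (
    (x > dfc["minx"].to_numpy()) & (x < dfc["maxx"].to_numpy())
    & (y > dfc["miny"].to_numpy()) & (y < dfc["maxy"].to_numpy())
)

has_hit = inside.any(axis=1)       # does the point fall in any cell?
first_hit = inside.argmax(axis=1)  # index of the first matching cell (0 if none)

# Look up the code of the matching cell; points without a match get None.
codes = dfc["Code"].to_numpy(dtype=object)
dfp["GridCode"] = np.where(has_hit, codes[first_hit], None)

print(dfp)
```

The intermediate boolean matrix has one entry per (point, cell) pair, so for a very large grid and many points it may need to be built in chunks of points to keep memory use reasonable.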
