Python numpy: create 2d array of values based on coordinates

Question

I have a file containing 3 columns, where the first two are coordinates (x,y) and the third is a value (z) corresponding to that position. Here's a short example:

x y z
0 1 14
0 2 17
1 0 15
1 1 16
2 1 18
2 2 13

I want to create a 2D array of values from the third column based on their x,y coordinates in the file. I read in each column as an individual array, and I created grids of x values and y values using numpy.meshgrid, like this:

x = [[0 1 2]    and   y = [[0 0 0]
     [0 1 2]               [1 1 1]
     [0 1 2]]              [2 2 2]]

But I'm new to Python and don't know how to produce a third grid of z values that looks like this:

z = [[Nan 15 Nan]
     [14  16  18]
     [17  Nan 13]]

Replacing NaN with 0 would be fine, too; my main problem is creating the 2D array in the first place. Thanks in advance for your help!

Recommended answer

Assuming the x and y values in your file directly correspond to indices (as they do in your example), you can do something similar to this:

import numpy as np

x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 0, 1, 1, 2]
z = [14, 17, 15, 16, 18, 13]

# Start from an all-NaN grid; cells with no data stay NaN
z_array = np.full((3, 3), np.nan)
# Rows are indexed by y, columns by x
z_array[y, x] = z

print(z_array)

Which yields:

[[ nan  15.  nan]
 [ 14.  16.  18.]
 [ 17.  nan  13.]]

For large arrays, this will be much faster than the explicit loop over the coordinates.
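
To connect this back to the file described in the question, here is a minimal sketch that reads the three columns and sizes the grid from the largest index in each direction. The inline text is only a stand-in for the real file, and loading with np.loadtxt is an assumption about how the data is read.

```python
import io
import numpy as np

# Stand-in for the real file: same three columns as the example in the question
text = """x y z
0 1 14
0 2 17
1 0 15
1 1 16
2 1 18
2 2 13
"""

# Skip the header row and split the columns into three 1-D arrays
x, y, z = np.loadtxt(io.StringIO(text), skiprows=1, unpack=True)

# The coordinates are used as indices, so they must be integers
x = x.astype(int)
y = y.astype(int)

# Size the grid from the largest index seen in each direction
z_array = np.full((y.max() + 1, x.max() + 1), np.nan)
z_array[y, x] = z

print(z_array)
```

With a real file, the io.StringIO(text) argument would be replaced by the file name.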

If you have regularly sampled x & y points, then you can convert them to grid indices by subtracting the "corner" of your grid (i.e. x0 and y0), dividing by the cell spacing, and casting as ints. You can then use the method above or in any of the other answers.

As a general example:

i = ((y - y0) / dy).astype(int)
j = ((x - x0) / dx).astype(int)

grid[i,j] = z
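
A runnable version of the snippet above might look like the following. The sample coordinates and the spacings dx and dy are made up for illustration. It also rounds before casting, which is a small variation on the snippet: a plain astype(int) truncates, so a quotient such as 1.9999999 would land in the wrong cell.

```python
import numpy as np

# Hypothetical regularly sampled data: x in steps of 0.5, y in steps of 0.25
x = np.array([10.0, 10.5, 11.0, 11.0])
y = np.array([-1.0, -0.75, -1.0, -0.5])
z = np.array([1.0, 2.0, 3.0, 4.0])

x0, y0 = x.min(), y.min()   # "corner" of the grid
dx, dy = 0.5, 0.25          # cell spacing, assumed known

# Round before casting so that a result like 1.9999999 becomes 2, not 1
i = np.round((y - y0) / dy).astype(int)
j = np.round((x - x0) / dx).astype(int)

# Rows follow y, columns follow x; cells without data stay NaN
grid = np.full((i.max() + 1, j.max() + 1), np.nan)
grid[i, j] = z

print(grid)
```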

However, there are a couple of tricks you can use if your data is not regularly spaced.

Let's say we have the following data:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1977)
x, y, z = np.random.random((3, 10))

fig, ax = plt.subplots()
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)

We want to put these points onto a regular 10x10 grid:

We can actually use/abuse np.histogram2d for this. Instead of counts, we'll have it add up the z value of each point that falls into a cell. It's easiest to do this by passing weights=z and leaving normalization off (normed=False on older NumPy; current releases call the argument density, and False is its default).

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1977)
x, y, z = np.random.random((3, 10))

# Bin the data onto a 10x10 grid
# Have to reverse x & y due to row-first indexing
# (normalization is off by default; the old normed argument is gone from recent NumPy)
zi, yi, xi = np.histogram2d(y, x, bins=(10,10), weights=z)
zi = np.ma.masked_equal(zi, 0)

fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi, edgecolors='black')
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)

plt.show()

However, if we have a large number of points, some bins will have more than one point. The weights argument to np.histogram2d simply adds the values. That's probably not what you want in this case. Nonetheless, we can get the mean of the points that fall in each cell by dividing by the counts.
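
As a quick check of the sum-versus-mean behaviour on a tiny case, here is a sketch with three made-up points, two of which share a cell. The 2x2 grid and the explicit range over the unit square are assumptions chosen to make the result easy to read.

```python
import numpy as np

# Three hypothetical points; the first two fall into the same cell
x = np.array([0.1, 0.2, 0.9])
y = np.array([0.1, 0.2, 0.9])
z = np.array([10.0, 20.0, 5.0])

# A 2x2 grid over the unit square (the explicit range is an assumption for this demo)
bins, extent = (2, 2), [[0, 1], [0, 1]]
sums, _, _ = np.histogram2d(y, x, bins=bins, range=extent, weights=z)
counts, _, _ = np.histogram2d(y, x, bins=bins, range=extent)

# Divide only where a cell has points, leaving empty cells as NaN
means = np.full(sums.shape, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)

print(sums)   # weighted histogram: the shared cell holds 10 + 20
print(means)  # the shared cell holds the mean of the two values
```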

So, for example, let's say we have 50 points:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1977)
x, y, z = np.random.random((3, 50))

# Bin the data onto a 10x10 grid
# Have to reverse x & y due to row-first indexing
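# Sum of z per cell, then number of points per cell
zi, yi, xi = np.histogram2d(y, x, bins=(10,10), weights=z)
counts, _, _ = np.histogram2d(y, x, bins=(10,10))

# Empty cells give 0/0 = NaN; silence the warning and mask them out
with np.errstate(invalid='ignore'):
    zi = zi / counts
zi = np.ma.masked_invalid(zi)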

fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi, edgecolors='black')
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)

plt.show()

With very large numbers of points, this exact method will become slow (and can be sped up easily), but it's sufficient for anything less than ~1e6 points.
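
The answer does not say which speed-up it has in mind. One possible approach, sketched here as an assumption rather than as the author's method, is to compute the cell index of every point directly and accumulate with np.bincount. This version bins over the unit square [0, 1) instead of the data's min/max range that np.histogram2d uses by default, and whether it is actually faster should be measured on your own data.

```python
import numpy as np

rng = np.random.default_rng(1977)
x, y, z = rng.random((3, 1000))

nx = ny = 10
# Uniform bins over [0, 1): the cell index is floor(coordinate * number_of_bins).
# The clip guards against a coordinate landing exactly on the upper edge.
j = np.clip((x * nx).astype(int), 0, nx - 1)
i = np.clip((y * ny).astype(int), 0, ny - 1)

# Flatten (i, j) into one index per point, then accumulate in a single pass
flat = np.ravel_multi_index((i, j), (ny, nx))
sums = np.bincount(flat, weights=z, minlength=ny * nx)
counts = np.bincount(flat, minlength=ny * nx)

# Mean per cell; empty cells (count 0) are left as NaN
mean = np.full(ny * nx, np.nan)
filled = counts > 0
mean[filled] = sums[filled] / counts[filled]
mean = mean.reshape(ny, nx)
```

The resulting mean array has rows along y and columns along x, like zi in the examples above.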
