Resampling irregularly spaced data to a regular grid in Python


Question

I need to resample 2D-data to a regular grid.

This is what my code looks like:

import matplotlib.mlab as ml
import numpy as np

y = np.zeros((512,115))
x = np.zeros((512,115))

# Just random data for this test:
data = np.random.randn(512,115)

# filling the grid coordinates:    
for i in range(512):
    y[i,:]=np.arange(380,380+4*115,4)

for i in range(115):
    x[:,i] = np.linspace(-8,8,512)
    y[:,i] -=  np.linspace(-0.1,0.2,512)

# Defining the regular grid
y_i = np.arange(380,380+4*115,4)
x_i = np.linspace(-8,8,512)

resampled_data = ml.griddata(x,y,data,x_i,y_i)

(512,115) is the shape of the 2D data, and I already installed mpl_toolkits.natgrid.

My issue is that I get back a masked array, where most of the entries are nan, instead of an array that is mostly composed of regular entries and just nan at the borders.

Could someone point me to what I am doing wrong?

Thanks!

Recommended answer

Comparing your code example to your question's title, I think you're a bit confused...

In your example code, you're creating regularly gridded random data and then resampling it onto another regular grid. You don't have irregular data anywhere in your example...

(Also, the code doesn't run as-is, and you should look into meshgrid rather than looping through to generate your x & y grids.)
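As a minimal sketch of the meshgrid suggestion, the two loops in the question could be replaced with something like the following. The names `x_axis` and `y_axis` are illustrative and not from the question; the axis values and the row-dependent shift are copied from the question's code.

```python
import numpy as np

# Axis values taken from the question's code
x_axis = np.linspace(-8, 8, 512)            # one value per row
y_axis = np.arange(380, 380 + 4 * 115, 4)   # one value per column

# indexing="ij" yields arrays of shape (len(x_axis), len(y_axis)) == (512, 115)
x, y = np.meshgrid(x_axis, y_axis, indexing="ij")

# Same row-dependent shift that the question subtracts from y in its second loop
y = y - np.linspace(-0.1, 0.2, 512)[:, np.newaxis]
```

With `indexing="ij"` the first axis follows `x_axis`, which matches the (512, 115) layout used in the question.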

If you want to resample an already regularly sampled grid, as you do in your example, there are more efficient methods than griddata or anything I'm about to describe below. (scipy.ndimage.map_coordinates would be well suited to your problem, in that case.)
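A rough sketch of what the map_coordinates route might look like for the question's data, under the assumption that each row's y samples are 4 units apart and offset by the per-row shift from the question (so the sample in column c of row i sits at y = 380 + 4c - shift[i]). The variable names and the choice of `order=1` and `mode="nearest"` are illustrative, not something the question specifies.

```python
import numpy as np
from scipy import ndimage

nrows, ncols = 512, 115
data = np.random.randn(nrows, ncols)      # stand-in data, as in the question
shift = np.linspace(-0.1, 0.2, nrows)     # per-row y offset from the question

# Integer index grids for the output array
rows, cols = np.meshgrid(np.arange(nrows), np.arange(ncols), indexing="ij")

# The regular target y = 380 + 4*j corresponds to fractional column
# j + shift[i] / 4 in the shifted input (column spacing is 4).
cols_frac = cols + shift[:, np.newaxis] / 4.0

# order=1 is linear interpolation; mode="nearest" clamps at the edges
resampled = ndimage.map_coordinates(
    data, [rows, cols_frac], order=1, mode="nearest"
)
```

Because this works in index space, no triangulation is needed, which is why it tends to be much cheaper than griddata for data that is already on a grid.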

Based on your question, however, it sounds like you have irregularly spaced data that you want to interpolate onto a regular grid.

In that case, you might have some points like this:

import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt

# Bounds and number of the randomly generated data points
ndata = 20
xmin, xmax = -8, 8
ymin, ymax = 380, 2428

# Generate random data
x = np.random.randint(xmin, xmax, ndata)
y = np.random.randint(ymin, ymax, ndata)
z = np.random.random(ndata)

# Plot the random data points
plt.scatter(x,y,c=z)
plt.axis([xmin, xmax, ymin, ymax])
plt.colorbar()
plt.show()

You can then interpolate the data as you were doing before... (Continued from code snippet above...)

# Size of regular grid
ny, nx = 512, 115

# Generate a regular grid to interpolate the data.
xi = np.linspace(xmin, xmax, nx)
yi = np.linspace(ymin, ymax, ny)
xi, yi = np.meshgrid(xi, yi)

# Interpolate using Delaunay triangulation
# (mlab.griddata was removed in Matplotlib 3.1, so this call needs an older release)
zi = mlab.griddata(x,y,z,xi,yi)

# Plot the results
plt.figure()
plt.pcolormesh(xi,yi,zi)
plt.scatter(x,y,c=z)
plt.colorbar()
plt.axis([xmin, xmax, ymin, ymax])
plt.show()
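On current installations, where `matplotlib.mlab.griddata` is no longer available, a roughly equivalent sketch can be written with `scipy.interpolate.griddata`. This is a substitute for the call used in the answer, not the answer's own method; the seeded generator and the use of uniform floats instead of random integers are assumptions made for reproducibility.

```python
import numpy as np
from scipy.interpolate import griddata

# Scattered sample points (assumed: uniform floats, seeded for repeatability)
rng = np.random.default_rng(0)
ndata = 20
xmin, xmax = -8, 8
ymin, ymax = 380, 2428
x = rng.uniform(xmin, xmax, ndata)
y = rng.uniform(ymin, ymax, ndata)
z = rng.random(ndata)

# Regular target grid, same size as in the answer
ny, nx = 512, 115
xi, yi = np.meshgrid(np.linspace(xmin, xmax, nx), np.linspace(ymin, ymax, ny))

# Linear interpolation on a Delaunay triangulation of the input points
zi = griddata((x, y), z, (xi, yi), method="linear")

# Cells outside the convex hull of the points come back as NaN; mask them
zi_masked = np.ma.masked_invalid(zi)
```

`zi_masked` can be passed to `plt.pcolormesh(xi, yi, zi_masked)` in the same way as the array in the plotting code above.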

However, you'll notice that you're getting lots of artifacts in the grid. This is due to the fact that your x coordinates range from -8 to 8, while y coordinates range from ~300 to ~2500. The interpolation algorithm is trying to make things isotropic, while you may want a highly anisotropic interpolation (so that it appears isotropic when the grid is plotted).

To correct for this, you need to create a new coordinate system to do your interpolation in. There is no one right way to do this. What I'm using below will work, but the "best" way depends heavily on what your data actually represents.

(In other words, use what you know about the system that your data is measuring to decide how to do it. This is always true with interpolation! You should not interpolate unless you know what the result should look like, and are familiar enough with the interpolation algorithm to use that a priori information to your advantage!! There are also much more flexible interpolation algorithms than the Delaunay triangulation that griddata uses by default, as well, but it's fine for a simple example...)

At any rate, one way to do this is to rescale the x and y coordinates so that they range over roughly the same magnitudes. In this case, we'll rescale them to run from 0 to 1... (forgive the spaghetti string code... I'm just intending this to be an example...)

# (Continued from examples above...)
# Normalize coordinate system
def normalize_x(data):
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    data = data.astype(float)
    return (data - xmin) / (xmax - xmin)

def normalize_y(data):
    data = data.astype(float)
    return (data - ymin) / (ymax - ymin)

x_new, xi_new = normalize_x(x), normalize_x(xi)
y_new, yi_new = normalize_y(y), normalize_y(yi)

# Interpolate using Delaunay triangulation
zi = mlab.griddata(x_new, y_new, z, xi_new, yi_new)

# Plot the results
plt.figure()
plt.pcolormesh(xi,yi,zi)
plt.scatter(x,y,c=z)
plt.colorbar()
plt.axis([xmin, xmax, ymin, ymax])
plt.show()
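The same rescaling idea can be sketched with `scipy.interpolate.griddata`, again as a stand-in for the removed `mlab.griddata`. The helper `normalize` and the seeded sample data are hypothetical; `rescale=True` is SciPy's built-in option, which rescales using the extent of the input points rather than the fixed bounds, so the two results need not match exactly.

```python
import numpy as np
from scipy.interpolate import griddata

def normalize(values, lo, hi):
    """Map values from the interval [lo, hi] onto [0, 1] as floats."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

# Assumed sample data, seeded for repeatability
rng = np.random.default_rng(0)
xmin, xmax = -8, 8
ymin, ymax = 380, 2428
x = rng.uniform(xmin, xmax, 20)
y = rng.uniform(ymin, ymax, 20)
z = rng.random(20)

xi, yi = np.meshgrid(np.linspace(xmin, xmax, 115), np.linspace(ymin, ymax, 512))

# Option 1: normalize points and grid by hand, then interpolate
zi_manual = griddata(
    (normalize(x, xmin, xmax), normalize(y, ymin, ymax)),
    z,
    (normalize(xi, xmin, xmax), normalize(yi, ymin, ymax)),
    method="linear",
)

# Option 2: let SciPy rescale the points to the unit cube internally
zi_auto = griddata((x, y), z, (xi, yi), method="linear", rescale=True)
```

Either array is still indexed by the original `xi`, `yi` grid, so it can be plotted in the original coordinates just like the result above.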

Hope that helps, at any rate... Sorry for the length of the answer!
