matplotlib.mlab.griddata very slow and returns array of nan when valid data is input
Problem description
I am trying to map an irregularly gridded dataset (raw satellite data) with associated latitudes and longitudes onto a regularly gridded set of latitudes and longitudes given by basemap.makegrid(). I am using matplotlib.mlab.griddata with mpl_toolkits.natgrid installed. Below is the output of whos in IPython for the variables being used, along with some statistics on them:
Variable Type Data/Info
-------------------------------
datalat ndarray 666x1081: 719946 elems, type `float32`, 2879784 bytes (2 Mb)
datalon ndarray 666x1081: 719946 elems, type `float32`, 2879784 bytes (2 Mb)
gridlat ndarray 1200x1000: 1200000 elems, type `float64`, 9600000 bytes (9 Mb)
gridlon ndarray 1200x1000: 1200000 elems, type `float64`, 9600000 bytes (9 Mb)
var ndarray 666x1081: 719946 elems, type `float32`, 2879784 bytes (2 Mb)
In [11]: var.min()
Out[11]: -30.0
In [12]: var.max()
Out[12]: 30.0
In [13]: datalat.min()
Out[13]: 27.339874
In [14]: datalat.max()
Out[14]: 47.05302
In [15]: datalon.min()
Out[15]: -137.55658
In [16]: datalon.max()
Out[16]: -108.41629
In [17]: gridlat.min()
Out[17]: 30.394031556984299
In [18]: gridlat.max()
Out[18]: 44.237140350357713
In [19]: gridlon.min()
Out[19]: -136.17646180595321
In [20]: gridlon.max()
Out[20]: -113.82353819404671
datalat and datalon are the original data coordinates.
gridlat and gridlon are the coordinates to interpolate to.
var contains the actual data.
Using these variables, when I call griddata(datalon, datalat, var, gridlon, gridlat), it takes as long as 20 minutes to complete and returns an array of nan. From looking at the data, the latitudes and longitudes appear to be correct: the original coordinates overlap a portion of the new area, and a few data points lie outside of it. Does anyone have any suggestions? The nan values suggest that I'm doing something stupid...
Recommended answer
It looks like the mlab.griddata routine may impose additional constraints on your output locations that may not be necessary. While the input locations may be anything, the output locations must form a regular grid. Since your example is in lat/lon space, your choice of map projection may violate this (i.e. a regular grid in x/y is not a regular grid in lat/lon).
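One way to check this before calling mlab.griddata is to test whether the target coordinates really form a uniformly spaced, axis-aligned grid. The helper below is only a sketch using NumPy. The name is_regular_grid, the tolerance, and the sample coordinates are illustrative assumptions and not part of matplotlib or basemap.

```python
import numpy as np

def is_regular_grid(x2d, y2d, rtol=1e-6):
    """Return True if 2-D coordinate arrays form a uniformly spaced, axis-aligned grid.

    Hypothetical helper for illustration. Assumes a meshgrid-style layout
    (x varies along columns, y varies along rows) with at least 2 rows and columns.
    """
    x2d = np.asarray(x2d, dtype=float)
    y2d = np.asarray(y2d, dtype=float)
    # Every row of x must equal the first row; every column of y the first column.
    rows_match = np.allclose(x2d, x2d[0:1, :], rtol=rtol)
    cols_match = np.allclose(y2d, y2d[:, 0:1], rtol=rtol)
    if not (rows_match and cols_match):
        return False
    dx = np.diff(x2d[0, :])
    dy = np.diff(y2d[:, 0])
    # Spacing must be positive (monotonically increasing) and constant.
    return bool(np.all(dx > 0) and np.all(dy > 0)
                and np.allclose(dx, dx[0], rtol=rtol)
                and np.allclose(dy, dy[0], rtol=rtol))

# A plain lon/lat meshgrid is regular.
lon, lat = np.meshgrid(np.linspace(-136.0, -114.0, 5), np.linspace(30.0, 44.0, 4))
regular = is_regular_grid(lon, lat)

# A grid whose longitudes shift with latitude (made-up distortion standing in
# for a grid that is regular only in projected x/y) is not.
skewed_lon = lon + 0.1 * (lat - 30.0)
skewed = is_regular_grid(skewed_lon, lat)
```

If a check like this fails for gridlon and gridlat, the output locations are not the kind of grid the answer says mlab.griddata expects.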
You might try the interpolate.griddata routine from SciPy as an alternative. You'll need to combine your location variables into a single array, though, since the call signature is different. Something like
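import numpy as np
import scipy.interpolate

# np.vstack takes a single sequence of arrays, so pass a tuple.
# After .T each array has shape (N, 2): one (lon, lat) pair per row.
data_locations = np.vstack((datalon.ravel(), datalat.ravel())).T
grid_locations = np.vstack((gridlon.ravel(), gridlat.ravel())).T
# var is the data array from the question.
grid_data = scipy.interpolate.griddata(data_locations, var.ravel(),
                                       grid_locations, method='nearest')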
for nearest-neighbor interpolation. This gets the locations into an array with 2 columns corresponding to your 2 dimensions. You may also want to perform the interpolation in the transformed space of your map projection.
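As a self-contained illustration of the SciPy approach, the sketch below runs scipy.interpolate.griddata on synthetic scattered points (made-up values, not the asker's satellite data) and reshapes the flat result back to the grid shape. It also compares method='nearest' with method='linear', since linear interpolation in SciPy returns nan for targets outside the convex hull of the input points. Whether that is related to the nan values seen with mlab.griddata is not established here.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Synthetic scattered samples; coordinates and values are assumptions for the demo.
src_lon = rng.uniform(-130.0, -120.0, size=500)
src_lat = rng.uniform(32.0, 42.0, size=500)
src_val = np.sin(np.radians(src_lon)) + np.cos(np.radians(src_lat))

# Target grid that deliberately extends beyond the sampled area.
grid_lon, grid_lat = np.meshgrid(np.linspace(-135.0, -115.0, 40),
                                 np.linspace(30.0, 44.0, 30))

# Both point sets must be (N, 2) arrays of (lon, lat) rows.
points = np.column_stack((src_lon, src_lat))
targets = np.column_stack((grid_lon.ravel(), grid_lat.ravel()))

# griddata returns a flat array here, so reshape it to the grid's 2-D shape.
nearest = griddata(points, src_val, targets, method='nearest').reshape(grid_lon.shape)
linear = griddata(points, src_val, targets, method='linear').reshape(grid_lon.shape)

# 'nearest' fills every cell; 'linear' leaves nan outside the convex hull.
n_nan_nearest = int(np.isnan(nearest).sum())
n_nan_linear = int(np.isnan(linear).sum())
```

To interpolate in the projection's transformed space instead, the same call would apply with projected x/y coordinates substituted for the longitude and latitude columns.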