Speedup scipy griddata for multiple interpolations between two irregular grids
Problem Description
I have several values that are defined on the same irregular grid (x, y, z) that I want to interpolate onto a new grid (x1, y1, z1). That is, I have f(x, y, z), g(x, y, z), h(x, y, z) and I want to calculate f(x1, y1, z1), g(x1, y1, z1), h(x1, y1, z1).
At the moment I am doing this using scipy.interpolate.griddata and it works well. However, because I have to perform each interpolation separately and there are many points, it is quite slow, and there is a great deal of duplicated work in the calculation (i.e. finding which points are closest, setting up the grids, etc.).
Is there a way to speed up the calculation and reduce the duplicated computations? That is, something along the lines of defining the two grids once, then changing the values for the interpolation?
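In code, the slow pattern described above looks something like the following sketch (with hypothetical toy data, not the asker's actual arrays); note that each call re-triangulates the same source points:

```python
import itertools
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Toy stand-in for the setup in the question: the cube corners are
# included in the source grid so no query point needs extrapolation.
corners = np.array(list(itertools.product([0.0, 1.0], repeat=3)))
points = np.vstack([corners, rng.random((1000, 3))])  # irregular grid (x, y, z)
f = rng.random(len(points))                           # f sampled on the grid
g = rng.random(len(points))                           # g sampled on the grid
new_points = rng.random((200, 3))                     # new grid (x1, y1, z1)

# The duplicated work: every call triangulates `points` from scratch.
f1 = griddata(points, f, new_points, method='linear')
g1 = griddata(points, g, new_points, method='linear')
```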
There are several things going on every time you make a call to scipy.interpolate.griddata:
- First, a call to scipy.spatial.Delaunay (sp.spatial.qhull.Delaunay in older SciPy versions) is made to triangulate the irregular grid coordinates.
- Then, for each point in the new grid, the triangulation is searched to find in which triangle (actually, in which simplex, which in your 3D case means in which tetrahedron) it lies.
- The barycentric coordinates of each new grid point with respect to the vertices of the enclosing simplex are computed.
- An interpolated value is computed for that grid point, using the barycentric coordinates and the values of the function at the vertices of the enclosing simplex.
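The last two steps can be sketched for a single 2-D point and a single, hand-picked triangle (hypothetical values, purely illustrative):

```python
import numpy as np

# One triangle with vertices A, B, C and function values at the vertices.
A, B, C = np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])
vals = np.array([1.0, 3.0, 5.0])   # f(A), f(B), f(C)

p = np.array([0.25, 0.25])         # query point inside the triangle

# Step 3: barycentric coordinates (w_A, w_B, w_C), which sum to 1.
T = np.column_stack((B - A, C - A))      # 2x2 affine transform of the simplex
w_bc = np.linalg.solve(T, p - A)         # weights for B and C
w = np.array([1.0 - w_bc.sum(), *w_bc])  # prepend the weight for A

# Step 4: the interpolated value is the weighted sum of vertex values.
interp = w @ vals   # 0.5*1 + 0.25*3 + 0.25*5 = 2.5
```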
The first three steps are identical for all your interpolations, so if you could store, for each new grid point, the indices of the vertices of the enclosing simplex and the weights for the interpolation, you would cut the amount of computation by a lot. Unfortunately, this is not easy to do directly with the available functionality, although it is indeed possible:
import itertools

import numpy as np
import scipy.interpolate as spint
from scipy.spatial import Delaunay  # scipy.spatial.qhull was removed in SciPy 1.8

def interp_weights(xyz, uvw):
    d = xyz.shape[1]                 # dimensionality (3 in your case)
    tri = Delaunay(xyz)              # step 1: triangulate the source grid
    simplex = tri.find_simplex(uvw)  # step 2: locate each new grid point
    vertices = np.take(tri.simplices, simplex, axis=0)
    temp = np.take(tri.transform, simplex, axis=0)
    delta = uvw - temp[:, d]
    bary = np.einsum('njk,nk->nj', temp[:, :d, :], delta)  # step 3: weights
    return vertices, np.hstack((bary, 1 - bary.sum(axis=1, keepdims=True)))

def interpolate(values, vtx, wts):
    return np.einsum('nj,nj->n', np.take(values, vtx), wts)  # step 4
The function interp_weights does the calculations for the first three steps listed above. Then the function interpolate uses those calculated values to do step 4 very fast:
m, n, d = 35000, 3000, 3
# make sure no new grid point is extrapolated
bounding_cube = np.array(list(itertools.product([0, 1], repeat=d)))
xyz = np.vstack((bounding_cube,
                 np.random.rand(m - len(bounding_cube), d)))
f = np.random.rand(m)
g = np.random.rand(m)
uvw = np.random.rand(n, d)
In [2]: vtx, wts = interp_weights(xyz, uvw)
In [3]: np.allclose(interpolate(f, vtx, wts), spint.griddata(xyz, f, uvw))
Out[3]: True
In [4]: %timeit spint.griddata(xyz, f, uvw)
1 loops, best of 3: 2.81 s per loop
In [5]: %timeit interp_weights(xyz, uvw)
1 loops, best of 3: 2.79 s per loop
In [6]: %timeit interpolate(f, vtx, wts)
10000 loops, best of 3: 66.4 us per loop
In [7]: %timeit interpolate(g, vtx, wts)
10000 loops, best of 3: 67 us per loop
So first, it does the same as griddata, which is good. Second, setting up the interpolation, i.e. computing vtx and wts, takes roughly the same time as a call to griddata. But third, you can now interpolate different values on the same grid in virtually no time.
The only thing that griddata does that is not contemplated here is assigning fill_value to points that have to be extrapolated. You could do that by checking for points for which at least one of the weights is negative, e.g.:
def interpolate(values, vtx, wts, fill_value=np.nan):
    ret = np.einsum('nj,nj->n', np.take(values, vtx), wts)
    ret[np.any(wts < 0, axis=1)] = fill_value  # mask extrapolated points
    return ret
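A quick sanity check of the masking, with hand-made weights rather than real triangulation output (the function is repeated here only so the snippet runs on its own):

```python
import numpy as np

def interpolate(values, vtx, wts, fill_value=np.nan):
    ret = np.einsum('nj,nj->n', np.take(values, vtx), wts)
    ret[np.any(wts < 0, axis=1)] = fill_value  # mask extrapolated points
    return ret

values = np.array([10.0, 20.0, 30.0, 40.0])
vtx = np.array([[0, 1, 2],
                [1, 2, 3]])
wts = np.array([[0.5, 0.25, 0.25],   # inside: valid barycentric weights
                [1.2, -0.1, -0.1]])  # outside: some weights are negative
out = interpolate(values, vtx, wts)  # second point becomes fill_value (nan)
```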