Speedup scipy griddata for multiple interpolations between two irregular grids


Problem description

I have several values that are defined on the same irregular grid (x, y, z) that I want to interpolate onto a new grid (x1, y1, z1). That is, I have f(x, y, z), g(x, y, z), h(x, y, z) and I want to calculate f(x1, y1, z1), g(x1, y1, z1), h(x1, y1, z1).

At the moment I am doing this using scipy.interpolate.griddata and it works well. However, because I have to perform each interpolation separately and there are many points, it is quite slow, with a great deal of duplication in the calculation (i.e., finding which points are closest, setting up the grids, etc.).

Is there a way to speed up the calculation and reduce the duplicated work? That is, something along the lines of defining the two grids once, then changing only the values for the interpolation?

Solution

There are several things going on every time you make a call to scipy.interpolate.griddata:

  1. First, a call to sp.spatial.qhull.Delaunay is made to triangulate the irregular grid coordinates.
  2. Then, for each point in the new grid, the triangulation is searched to find which triangle it lies in (actually, which simplex, which in your 3D case means which tetrahedron).
  3. The barycentric coordinates of each new grid point with respect to the vertices of the enclosing simplex are computed.
  4. An interpolated value is computed for that grid point, using the barycentric coordinates and the values of the function at the vertices of the enclosing simplex.
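
The four steps can be sketched by hand on a tiny 2D case. This is only an illustration under assumed toy data (the unit-square corners plus the centre, and a linear field x + 2y, none of which come from the question), using the public scipy.spatial.Delaunay attributes simplices, transform and find_simplex:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical toy data: corners of the unit square plus its centre.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
values = points[:, 0] + 2.0 * points[:, 1]   # linear field f(x, y) = x + 2y
target = np.array([[0.25, 0.4]])             # one point of the "new grid"

tri = Delaunay(points)                       # step 1: triangulate
simplex = tri.find_simplex(target)           # step 2: enclosing triangle index
vertices = tri.simplices[simplex]            # vertex indices, shape (1, 3)

# Step 3: barycentric coordinates from each simplex's affine transform.
ndim = points.shape[1]
T = tri.transform[simplex]                   # shape (1, ndim + 1, ndim)
partial = np.einsum('njk,nk->nj', T[:, :ndim, :], target - T[:, ndim, :])
weights = np.hstack((partial, 1.0 - partial.sum(axis=1, keepdims=True)))

# Step 4: weighted sum of the values at the simplex vertices.
result = np.einsum('nj,nj->n', values[vertices], weights)
print(result)  # approximately [1.05]
```

Because linear interpolation reproduces a linear field exactly, the result matches 0.25 + 2 * 0.4 = 1.05 up to round-off. Only step 4 touches the values array, which is why the first three steps can be shared between f, g and h.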

The first three steps are identical for all your interpolations, so if you could store, for each new grid point, the indices of the vertices of the enclosing simplex and the weights for the interpolation, you would minimize the amount of computations by a lot. This is unfortunately not easy to do directly with the functionality available, although it is indeed possible:

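import itertools

import numpy as np
import scipy.interpolate as spint
from scipy.spatial import Delaunay

def interp_weights(xyz, uvw):
    d = xyz.shape[1]  # number of spatial dimensions
    # Step 1: triangulate the source points.
    tri = Delaunay(xyz)
    # Step 2: index of the simplex enclosing each target point.
    simplex = tri.find_simplex(uvw)
    vertices = np.take(tri.simplices, simplex, axis=0)
    # Step 3: barycentric coordinates from each simplex's affine transform.
    temp = np.take(tri.transform, simplex, axis=0)
    delta = uvw - temp[:, d]
    bary = np.einsum('njk,nk->nj', temp[:, :d, :], delta)
    return vertices, np.hstack((bary, 1 - bary.sum(axis=1, keepdims=True)))

def interpolate(values, vtx, wts):
    # Step 4: weighted sum of the values at the simplex vertices.
    return np.einsum('nj,nj->n', np.take(values, vtx), wts)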

The function interp_weights does the calculations for the first three steps I listed above. Then the function interpolate uses those calculated values to do step 4 very fast:

m, n, d = 35000, 3000, 3  # integers, so they can be used as array sizes
# make sure no new grid point is extrapolated
bounding_cube = np.array(list(itertools.product([0, 1], repeat=d)))
xyz = np.vstack((bounding_cube,
                 np.random.rand(m - len(bounding_cube), d)))
f = np.random.rand(m)
g = np.random.rand(m)
uvw = np.random.rand(n, d)

In [2]: vtx, wts = interp_weights(xyz, uvw)

In [3]: np.allclose(interpolate(f, vtx, wts), spint.griddata(xyz, f, uvw))
Out[3]: True

In [4]: %timeit spint.griddata(xyz, f, uvw)
1 loops, best of 3: 2.81 s per loop

In [5]: %timeit interp_weights(xyz, uvw)
1 loops, best of 3: 2.79 s per loop

In [6]: %timeit interpolate(f, vtx, wts)
10000 loops, best of 3: 66.4 us per loop

In [7]: %timeit interpolate(g, vtx, wts)
10000 loops, best of 3: 67 us per loop

So first, it gives the same result as griddata, which is good. Second, setting up the interpolation, i.e. computing vtx and wts, takes roughly as long as a single call to griddata. But third, you can now interpolate different values on the same grid in virtually no time.
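
If f, g and h are stacked into one array, step 4 can be done for all of them in a single einsum call. The following is a minimal sketch of that idea, not part of the timing session above: the helper name interpolate_many, the array sizes and the random stand-in fields are all assumptions made for illustration, and the weights function is repeated so the snippet runs on its own.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.spatial import Delaunay

def interp_weights(xyz, uvw):
    """Steps 1-3: vertex indices and barycentric weights per target point."""
    ndim = xyz.shape[1]
    tri = Delaunay(xyz)
    simplex = tri.find_simplex(uvw)
    vertices = tri.simplices[simplex]
    T = tri.transform[simplex]
    bary = np.einsum('njk,nk->nj', T[:, :ndim, :], uvw - T[:, ndim, :])
    return vertices, np.hstack((bary, 1.0 - bary.sum(axis=1, keepdims=True)))

def interpolate_many(fields, vtx, wts):
    """Step 4 for several fields at once; fields has shape (n_fields, m)."""
    # fields[:, vtx] has shape (n_fields, n, ndim + 1)
    return np.einsum('fnj,nj->fn', fields[:, vtx], wts)

rng = np.random.default_rng(0)
# Cube corners keep every target point inside the convex hull.
corners = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)],
                   dtype=float)
xyz = np.vstack((corners, rng.random((500, 3))))
uvw = rng.random((50, 3))
fields = rng.random((3, len(xyz)))          # stand-ins for f, g and h

vtx, wts = interp_weights(xyz, uvw)         # paid once
out = interpolate_many(fields, vtx, wts)    # shape (3, 50)

# Cross-check against one griddata call per field.
reference = np.array([griddata(xyz, fld, uvw) for fld in fields])
print(np.allclose(out, reference))
```

The triangulation cost is paid once in interp_weights, and each extra field only adds one more row to the einsum.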

The only thing that griddata does that is not handled here is assigning fill_value to points that have to be extrapolated. You could do that by checking for points for which at least one of the weights is negative, e.g.:

def interpolate(values, vtx, wts, fill_value=np.nan):
    ret = np.einsum('nj,nj->n', np.take(values, vtx), wts)
    ret[np.any(wts < 0, axis=1)] = fill_value
    return ret
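
A possible variation, offered as a sketch rather than as what griddata does internally: Delaunay.find_simplex returns -1 for points outside the triangulation, so the outside mask can be taken from there instead of from the sign of the weights. That may avoid flagging interior points whose weights come out as tiny negative numbers through round-off. The function names interp_weights_masked and interpolate_masked and the toy data are hypothetical.

```python
import numpy as np
from scipy.spatial import Delaunay

def interp_weights_masked(xyz, uvw):
    """Steps 1-3, plus a boolean mask of target points outside the hull."""
    ndim = xyz.shape[1]
    tri = Delaunay(xyz)
    simplex = tri.find_simplex(uvw)      # -1 marks points outside the hull
    outside = simplex < 0
    vertices = tri.simplices[simplex]    # index -1 just picks the last simplex
    T = tri.transform[simplex]
    bary = np.einsum('njk,nk->nj', T[:, :ndim, :], uvw - T[:, ndim, :])
    wts = np.hstack((bary, 1.0 - bary.sum(axis=1, keepdims=True)))
    return vertices, wts, outside

def interpolate_masked(values, vtx, wts, outside, fill_value=np.nan):
    """Step 4, overwriting extrapolated points with fill_value."""
    ret = np.einsum('nj,nj->n', values[vtx], wts)
    ret[outside] = fill_value
    return ret

# Hypothetical toy data: unit-square corners plus the centre, field x + 2y.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
values = points[:, 0] + 2.0 * points[:, 1]
targets = np.array([[0.25, 0.4],    # inside the square
                    [2.0, 2.0]])    # outside the convex hull

vtx, wts, outside = interp_weights_masked(points, targets)
out = interpolate_masked(values, vtx, wts, outside)
print(out)  # first entry approximately 1.05, second entry nan
```

For the outside point, the weights are computed relative to an unrelated simplex, so at least one of them is negative and the negative-weight check above flags it as well.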
