Speedup scipy griddata for multiple interpolations between two irregular grids


Problem description

I have several values that are defined on the same irregular grid (x, y, z) that I want to interpolate onto a new grid (x1, y1, z1). i.e., I have f(x, y, z), g(x, y, z), h(x, y, z) and I want to calculate f(x1, y1, z1), g(x1, y1, z1), h(x1, y1, z1).

At the moment I am doing this using scipy.interpolate.griddata and it works well. However, because I have to perform each interpolation separately and there are many points, it is quite slow, with a great deal of duplication in the calculation (i.e. finding which points are closest, setting up the grids, etc.).

Is there a way to speed up the calculation and reduce the duplicated calculations? i.e. something along the lines of defining the two grids once, then changing the values for the interpolation?

Solution

There are several things going on every time you make a call to scipy.interpolate.griddata:

  1. First, a call to scipy.spatial.Delaunay (the Qhull wrapper, historically reachable as sp.spatial.qhull.Delaunay) is made to triangulate the irregular grid coordinates.
  2. Then, for each point in the new grid, the triangulation is searched to find which triangle (actually, which simplex; in your 3D case, which tetrahedron) it lies in.
  3. The barycentric coordinates of each new grid point with respect to the vertices of the enclosing simplex are computed.
  4. An interpolated value is computed for that grid point, using the barycentric coordinates and the values of the function at the vertices of the enclosing simplex.
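
Steps 2-3 above can be illustrated directly with Delaunay's `transform` attribute. A minimal 2D sketch with made-up points (not the answer's code):

```python
# How Delaunay's `transform` yields barycentric weights for one query point.
import numpy as np
from scipy.spatial import Delaunay

pts = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
tri = Delaunay(pts)

p = np.array([0.25, 0.25])        # query point
s = int(tri.find_simplex(p))      # index of the enclosing triangle (step 2)
T = tri.transform[s]              # shape (3, 2): affine map plus offset row
b = T[:2].dot(p - T[2])           # first 2 barycentric coordinates (step 3)
bary = np.append(b, 1 - b.sum())  # third coordinate makes them sum to 1
verts = tri.simplices[s]          # vertex indices of that triangle
# bary @ pts[verts] reconstructs p, confirming the weights are correct
```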

The first three steps are identical for all your interpolations, so if you could store, for each new grid point, the indices of the vertices of the enclosing simplex and the weights for the interpolation, you would cut the amount of computation enormously. This is unfortunately not straightforward with the available functionality, although it is indeed possible:

import itertools
import numpy as np
import scipy.interpolate as spint
from scipy.spatial import Delaunay  # public replacement for the old scipy.spatial.qhull import

def interp_weights(xyz, uvw, d=3):
    # Steps 1-3: triangulate the irregular grid, locate each new point's
    # enclosing simplex, and compute its barycentric coordinates.
    # (d is the dimensionality; the original snippet read it as a global.)
    tri = Delaunay(xyz)
    simplex = tri.find_simplex(uvw)
    vertices = np.take(tri.simplices, simplex, axis=0)
    temp = np.take(tri.transform, simplex, axis=0)
    delta = uvw - temp[:, d]
    bary = np.einsum('njk,nk->nj', temp[:, :d, :], delta)
    return vertices, np.hstack((bary, 1 - bary.sum(axis=1, keepdims=True)))

def interpolate(values, vtx, wts):
    # Step 4: weighted sum of the field values at the simplex vertices.
    return np.einsum('nj,nj->n', np.take(values, vtx), wts)

The function interp_weights does the calculations for the first three steps I listed above. Then the function interpolate uses those calculated values to do step 4 very fast:

m, n, d = 35000, 3000, 3  # integer sizes; the original floats (3.5e4, 3e3) break np.random.rand
# make sure no new grid point is extrapolated
bounding_cube = np.array(list(itertools.product([0, 1], repeat=d)))
xyz = np.vstack((bounding_cube,
                 np.random.rand(m - len(bounding_cube), d)))
f = np.random.rand(m)
g = np.random.rand(m)
uvw = np.random.rand(n, d)

In [2]: vtx, wts = interp_weights(xyz, uvw)

In [3]: np.allclose(interpolate(f, vtx, wts), spint.griddata(xyz, f, uvw))
Out[3]: True

In [4]: %timeit spint.griddata(xyz, f, uvw)
1 loops, best of 3: 2.81 s per loop

In [5]: %timeit interp_weights(xyz, uvw)
1 loops, best of 3: 2.79 s per loop

In [6]: %timeit interpolate(f, vtx, wts)
10000 loops, best of 3: 66.4 us per loop

In [7]: %timeit interpolate(g, vtx, wts)
10000 loops, best of 3: 67 us per loop

So first, it produces the same result as griddata, which is good. Second, setting up the interpolation, i.e. computing vtx and wts, takes roughly the same time as a call to griddata. But third, you can now interpolate different values on the same grid in virtually no time.
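
Going one step further than the answer above (a hypothetical extension, not part of the original): once vtx and wts are cached, several fields sampled on the same grid can be interpolated with a single einsum by stacking them into one array. The helper `interpolate_many` and the fabricated weights below are illustrative only:

```python
import numpy as np

def interpolate(values, vtx, wts):
    # single-field step 4, as in the answer above
    return np.einsum('nj,nj->n', np.take(values, vtx), wts)

def interpolate_many(stacked, vtx, wts):
    # stacked: (n_fields, m) values on the source grid
    # returns: (n_fields, n) interpolants, one row per field
    return np.einsum('fnj,nj->fn', stacked[:, vtx], wts)

# Fake cached indices/weights for 5 query points in 3D (4 vertices per simplex),
# standing in for the output of interp_weights.
rng = np.random.default_rng(0)
vtx = rng.integers(0, 10, size=(5, 4))
wts = rng.random((5, 4))
wts /= wts.sum(axis=1, keepdims=True)  # barycentric weights sum to 1 per point
f, g = rng.random(10), rng.random(10)

both = interpolate_many(np.vstack((f, g)), vtx, wts)
# both[0] matches interpolate(f, vtx, wts); both[1] matches interpolate(g, vtx, wts)
```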

The only thing that griddata does that is not covered here is assigning fill_value to points that would require extrapolation. You can handle that by checking for points where at least one of the weights is negative, e.g.:

def interpolate(values, vtx, wts, fill_value=np.nan):
    ret = np.einsum('nj,nj->n', np.take(values, vtx), wts)
    ret[np.any(wts < 0, axis=1)] = fill_value
    return ret
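
A quick sanity check of that extrapolation handling, as a self-contained sketch with made-up 2D data (the functions mirror those above, with the dimensionality passed explicitly). For an outside point, find_simplex returns -1, np.take then wraps to the last simplex, and at least one barycentric weight comes out negative, which the fill_value check turns into NaN:

```python
import numpy as np
from scipy.spatial import Delaunay

def interp_weights(xyz, uvw, d):
    tri = Delaunay(xyz)
    simplex = tri.find_simplex(uvw)       # -1 for points outside the hull
    vertices = np.take(tri.simplices, simplex, axis=0)
    temp = np.take(tri.transform, simplex, axis=0)
    delta = uvw - temp[:, d]
    bary = np.einsum('njk,nk->nj', temp[:, :d, :], delta)
    return vertices, np.hstack((bary, 1 - bary.sum(axis=1, keepdims=True)))

def interpolate(values, vtx, wts, fill_value=np.nan):
    ret = np.einsum('nj,nj->n', np.take(values, vtx), wts)
    ret[np.any(wts < 0, axis=1)] = fill_value  # mark extrapolated points
    return ret

square = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
f = square.sum(axis=1)                    # f(x, y) = x + y, linear
uv = np.array([[0.25, 0.25], [2.0, 2.0]])  # one inside point, one outside
vtx, wts = interp_weights(square, uv, d=2)
out = interpolate(f, vtx, wts)
# out[0] == 0.5 (barycentric interpolation is exact for linear fields);
# out[1] is NaN (outside the hull, so it gets fill_value)
```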
