如果我想对随机非结构化数据进行3D样条/平滑插值怎么办? [英] What to do if I want 3D spline/smooth interpolation of random unstructured data?

查看:86
本文介绍了如果我想对随机非结构化数据进行3D样条/平滑插值怎么办?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

@James的答案启发了我,以了解如何使用griddatamap_coordinates.在下面的示例中,我显示2D数据,但我对3D感兴趣.我注意到griddata仅提供1D和2D样条曲线,并且仅限于3D及更高版本的线性插值(可能出于很好的原因).但是,map_coordinates似乎可以在3D中使用更高阶(比分段线性更平滑)插值.

I was inspired by this answer by @James to see how griddata and map_coordinates might be used. In the examples below I'm showing 2D data, but my interest is in 3D. I noticed that griddata only provides splines for 1D and 2D, and is limited to linear interpolation for 3D and higher (probably for very good reasons). However, map_coordinates seems to be fine with 3D using higher order (smoother than piece-wise linear) interpolation.

我的主要问题:如果我在3D中具有随机的,非结构化的数据(在这里我无法使用map_coordinates),是否有某种方法可以使NumPy SciPy宇宙中的分段线性插值更加平滑,或至少在附近?

My primary question: if I have random, unstructured data (where I can not use map_coordinates) in 3D, is there some way to get smoother than piece-wise linear interpolation within the NumPy SciPy universe, or at least nearby?

我的第二个问题:griddata中是否没有3D样条线,因为实现起来很困难或乏味,还是有根本的困难?

My secondary question: is spline for 3D not available in griddata because it is difficult or tedious to implement, or is there a fundamental difficulty?

下面的图像和可怕的python显示了我目前对griddata和map_coordinates如何使用或不能使用的理解.沿粗黑线进行插值.

The images and horrible python below show my current understanding of how griddata and map_coordinates can or can't be used. Interpolation is done along the thick black line.

结构化数据:

联合国结构化数据:

UNSTRUCTURED DATA:

可怕的python:

import numpy as np
import matplotlib.pyplot as plt

def g(x, y):
    return np.exp(-((x-1.0)**2 + (y-1.0)**2))

def findit(x, X):  # or could use some 1D interpolation
    fraction = (x - X[0]) / (X[-1]-X[0])
    return fraction * float(X.shape[0]-1)

nth, nr = 12, 11
theta_min, theta_max = 0.2, 1.3
r_min,     r_max     = 0.7, 2.0

theta = np.linspace(theta_min, theta_max, nth)
r     = np.linspace(r_min,     r_max,     nr)

R, TH = np.meshgrid(r, theta)
Xp, Yp  = R*np.cos(TH), R*np.sin(TH)
array = g(Xp, Yp)

x, y = np.linspace(0.0, 2.0, 200), np.linspace(0.0, 2.0, 200)
X, Y = np.meshgrid(x, y)
blob = g(X, Y)

xtest = np.linspace(0.25, 1.75, 40)
ytest = np.zeros_like(xtest) + 0.75

rtest = np.sqrt(xtest**2 + ytest**2)
thetatest = np.arctan2(xtest, ytest)

ir = findit(rtest, r)
it = findit(thetatest, theta)

plt.figure()

plt.subplot(2,1,1)

plt.scatter(100.0*Xp.flatten(), 100.0*Yp.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.hold
plt.imshow(blob, origin='lower', cmap='gray')

plt.text(5, 5, "don't use jet!", color='white')


exact = g(xtest, ytest)

import scipy.ndimage.interpolation as spndint
ndint0 = spndint.map_coordinates(array, [it, ir], order=0)
ndint1 = spndint.map_coordinates(array, [it, ir], order=1)
ndint2 = spndint.map_coordinates(array, [it, ir], order=2)

import scipy.interpolate as spint
points = np.vstack((Xp.flatten(), Yp.flatten())).T   # could use np.array(zip(...))
grid_x = xtest
grid_y = np.array([0.75])

g0 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='nearest')
g1 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='linear')
g2 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='cubic')


plt.subplot(4,2,5)

plt.plot(exact, 'or')
#plt.plot(ndint0)
plt.plot(ndint1)
plt.plot(ndint2)
plt.title("map_coordinates")

plt.subplot(4,2,6)

plt.plot(exact, 'or')
#plt.plot(g0)
plt.plot(g1)
plt.plot(g2)
plt.title("griddata")

plt.subplot(4,2,7)

#plt.plot(ndint0 - exact)
plt.plot(ndint1 - exact)
plt.plot(ndint2 - exact)
plt.title("error map_coordinates")

plt.subplot(4,2,8)

#plt.plot(g0 - exact)
plt.plot(g1 - exact)
plt.plot(g2 - exact)
plt.title("error griddata")

plt.show()


seed_points_rand = 2.0 * np.random.random((400, 2))
rr = np.sqrt((seed_points_rand**2).sum(axis=-1))
thth = np.arctan2(seed_points_rand[...,1], seed_points_rand[...,0])
isinside = (rr>r_min) * (rr<r_max) * (thth>theta_min) * (thth<theta_max)
points_rand = seed_points_rand[isinside]

Xprand, Yprand = points_rand.T  # unpack
array_rand = g(Xprand, Yprand)

grid_x = xtest
grid_y = np.array([0.75])

plt.figure()

plt.subplot(2,1,1)

plt.scatter(100.0*Xprand.flatten(), 100.0*Yprand.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.hold
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')


g0rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='nearest')
g1rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='linear')
g2rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='cubic')

plt.subplot(4,2,6)

plt.plot(exact, 'or')
#plt.plot(g0rand)
plt.plot(g1rand)
plt.plot(g2rand)
plt.title("griddata")


plt.subplot(4,2,8)

#plt.plot(g0rand - exact)
plt.plot(g1rand - exact)
plt.plot(g2rand - exact)
plt.title("error griddata")

plt.show()

推荐答案

好问题! (还有漂亮的地块!)

Good question! (and nice plots!)

对于非结构化数据,您将需要切换回用于非结构化数据的功能. griddata是一个选项,但是使用三角测量,并且在两者之间进行线性插值.这会导致在三角形边界处出现硬"边缘.

For unstructured data, you'll want to switch back to functions meant for unstructured data. griddata is one option, but uses triangulation with linear interpolation in between. This leads to "hard" edges at triangle boundaries.

样条曲线是径向基函数.用scipy术语,您需要 scipy.interpolate.Rbf .我建议在三次样条曲线上使用function="linear"function="thin_plate",但是三次也可用. (与线性或薄板样条线相比,三次样条线会加剧超调"问题.)

Splines are radial basis functions. In scipy terms, you want scipy.interpolate.Rbf. I'd recommend using function="linear" or function="thin_plate" over cubic splines, but cubic is available as well. (Cubic splines will exacerbate problems with "overshooting" compared to linear or thin-plate splines.)

一个警告是,径向基函数的这种特定实现将始终使用数据集中的所有点.这是最准确,最平滑的方法,但是随着输入观察点数量的增加,它的伸缩性很差.有几种解决方法,但是事情会变得更加复杂.我会再提一个问题.

One caveat is that this particular implementation of radial basis functions will always use all points in your dataset. This is the most accurate and smooth approach, but it scales poorly as the number of input observation points increases. There are several ways around this, but things will get more complex. I'll leave that for another question.

无论如何,这是一个简化的示例.我们将生成随机数据,然后在常规网格上的点处进行插值. (请注意,输入不是在常规网格上,并且插值点也不必是.)

At any rate, here's a simplified example. We'll generate random data and then interpolate it at points that are on a regular grid. (Note that the input is not on a regular grid, and the interpolated points don't need to be either.)

import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)

x, y, z = np.random.random((3, 10))

interp = scipy.interpolate.Rbf(x, y, z, function='thin_plate')

yi, xi = np.mgrid[0:1:100j, 0:1:100j]

zi = interp(xi, yi)

plt.plot(x, y, 'ko')
plt.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
plt.colorbar()

plt.show()

我选择了"thin_plate"作为样条线的类型.我们的输入观察点范围为0到1(由np.random.random创建).请注意,我们的内插值略高于1且远低于零.这是超调".

I chose "thin_plate" as the type of spline. Our input observations points range from 0 to 1 (they're created by np.random.random). Notice that our interpolated values go slightly above 1 and well below zero. This is "overshooting".

线性样条曲线将完全避免过冲,但是您最终会遇到"bullseye"模式(尽管没有IDW方法那么严重).例如,以下是使用线性径向基函数插值的完全相同的数据.请注意,我们的插值永远不会超过1或低于0:

Linear splines will completely avoid overshooting, but you'll wind up with "bullseye" patterns (nowhere near as severe as with IDW methods, though). For example, here's the exact same data interpolated with a linear radial basis function. Notice that our interpolated values never go above 1 or below 0:

更高阶的样条曲线将使数据中的趋势更加连续,但将使过冲更多.默认的"multiquadric"与薄板样条曲线非常相似,但是会使事情变得更连续并且过冲更糟:

Higher order splines will make trends in the data more continuous but will overshoot more. The default "multiquadric" is fairly similar to a thin-plate spline, but will make things a bit more continuous and overshoot a bit worse:

但是,当您使用更高阶的样条线时,例如"cubic"(三阶):

However, as you go to even higher order splines such as "cubic" (third order):

"quintic"(五阶)

只要您稍微超出输入数据,就可以真正获得不合理的结果.

You can really wind up with unreasonable results as soon as you move even slightly beyond your input data.

无论如何,这是一个比较随机数据上不同径向基函数的简单示例:

At any rate, here's a simple example to compare different radial basis functions on random data:

import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)

x, y, z = np.random.random((3, 10))
yi, xi = np.mgrid[0:1:100j, 0:1:100j]

interp_types = ['multiquadric', 'inverse', 'gaussian', 'linear', 'cubic',
                'quintic', 'thin_plate']

for kind in interp_types:
    interp = scipy.interpolate.Rbf(x, y, z, function=kind)
    zi = interp(xi, yi)

    fig, ax = plt.subplots()
    ax.plot(x, y, 'ko')
    im = ax.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
    fig.colorbar(im)
    ax.set(title=kind)
    fig.savefig(kind + '.png', dpi=80)

plt.show()

这篇关于如果我想对随机非结构化数据进行3D样条/平滑插值怎么办?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆