如果我想要随机非结构化数据的 3D 样条/平滑插值怎么办? [英] What to do if I want 3D spline/smooth interpolation of random unstructured data?

查看:19
本文介绍了如果我想要随机非结构化数据的 3D 样条/平滑插值怎么办?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我受到@James 的这个

联合国结构化数据:

可怕的蟒蛇:

将 numpy 导入为 np导入 matplotlib.pyplot 作为 plt定义 g(x, y):返回 np.exp(-((x-1.0)**2 + (y-1.0)**2))def findit(x, X): # 或者可以使用一些一维插值分数 = (x - X[0])/(X[-1]-X[0])返回分数 * 浮点数(X.shape[0]-1)第 n, nr = 12, 11theta_min, theta_max = 0.2, 1.3r_min, r_max = 0.7, 2.0theta = np.linspace(theta_min, theta_max, nth)r = np.linspace(r_min, r_max, nr)R, TH = np.meshgrid(r, theta)Xp, Yp = R*np.cos(TH), R*np.sin(TH)数组 = g(Xp, Yp)x, y = np.linspace(0.0, 2.0, 200), np.linspace(0.0, 2.0, 200)X, Y = np.meshgrid(x, y)斑点 = g(X, Y)xtest = np.linspace(0.25, 1.75, 40)ytest = np.zeros_like(xtest) + 0.75rtest = np.sqrt(xtest**2 + ytest**2)thetatest = np.arctan2(xtest, ytest)ir = findit(rtest, r)it = findit(thetatest, theta)plt.figure()plt.subplot(2,1,1)plt.scatter(100.0*Xp.flatten(), 100.0*Yp.flatten())plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)保持plt.imshow(blob, origin='lower', cmap='gray')plt.text(5, 5, "不要使用喷气机!", color='white')精确 = g(xtest, ytest)导入 scipy.ndimage.interpolation 作为 spndintndint0 = spndint.map_coordinates(array, [it, ir], order=0)ndint1 = spndint.map_coordinates(array, [it, ir], order=1)ndint2 = spndint.map_coordinates(array, [it, ir], order=2)导入 scipy.interpolate 作为 spintpoints = np.vstack((Xp.flatten(), Yp.flatten())).T # 可以使用 np.array(zip(...))grid_x = xtestgrid_y = np.array([0.75])g0 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='nearest')g1 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='linear')g2 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='cubic')plt.subplot(4,2,5)plt.plot(精确,'或')#plt.plot(ndint0)plt.plot(ndint1)plt.plot(ndint2)plt.title("map_coordinates")plt.subplot(4,2,6)plt.plot(精确,'或')#plt.plot(g0)plt.plot(g1)plt.plot(g2)plt.title("griddata")plt.subplot(4,2,7)#plt.plot(ndint0 - 精确)plt.plot(ndint1 - 精确)plt.plot(ndint2 - 精确)plt.title("错误地图坐标")plt.subplot(4,2,8)#plt.plot(g0 - 精确)plt.plot(g1 - 精确)plt.plot(g2 - 精确)plt.title("错误网格数据")plt.show()seed_points_rand = 2.0 * np.random.random((400, 2))rr = np.sqrt((seed_points_rand**2).sum(axis=-1))第 th = np.arctan2(seed_points_rand[...,1], seed_points_rand[...,0])isinside = (rr>r_min) * (rr<r_max) * (thth>theta_min) * (thth<theta_max)points_rand = seed_points_rand[isinside]Xprand, Yprand = points_rand.T # 解压array_rand = g(Xprand, Yprand)grid_x = xtestgrid_y = np.array([0.75])plt.figure()plt.subplot(2,1,1)plt.scatter(100.0*Xprand.flatten(), 100.0*Yprand.flatten())plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)保持plt.imshow(blob, origin='lower', cmap='gray')plt.text(5, 5, "不要使用喷气机!", color='white')g0rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='nearest')g1rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='linear')g2rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='cubic')plt.subplot(4,2,6)plt.plot(精确,'或')#plt.plot(g0rand)plt.plot(g1rand)plt.plot(g2rand)plt.title("griddata")plt.subplot(4,2,8)#plt.plot(g0rand - 精确)plt.plot(g1rand - 精确)plt.plot(g2rand - 精确)plt.title("错误网格数据")plt.show()

解决方案

好问题!(和漂亮的情节!)

对于非结构化数据,您需要切换回用于非结构化数据的函数.griddata 是一种选择,但使用三角测量和线性插值.这会导致三角形边界处的硬"边.

样条是径向基函数.在 scipy 术语中,您需要

<小时>

样条类型的选择

我选择了 "thin_plate" 作为样条的类型.我们的输入观察点范围从 0 到 1(它们由 np.random.random 创建).请注意,我们的内插值略高于 1,远低于零.这就是过冲".

<小时>

线性样条将完全避免过冲,但最终会出现靶心"模式(尽管远不及 IDW 方法那么严重).例如,这是使用线性径向基函数内插的完全相同的数据.请注意,我们的内插值永远不会高于 1 或低于 0:

<小时>

更高阶的样条将使数据中的趋势更加连续,但会过冲更多.默认的 "multiquadric" 与薄板样条非常相似,但会使事情变得更加连续,并且过冲会更糟:

<小时>

但是,当您使用更高阶的样条,例如 "cubic"(三阶)时:

quintic"(五阶)

只要稍微超出输入数据,您就会得到不合理的结果.

<小时>

无论如何,这里有一个简单的例子来比较随机数据上的不同径向基函数:

将 numpy 导入为 np导入 scipy.interpolate导入 matplotlib.pyplot 作为 pltnp.random.seed(1977)x, y, z = np.random.random((3, 10))yi, xi = np.mgrid[0:1:100j, 0:1:100j]interp_types = ['multiquadric', 'inverse', 'gaussian', 'linear', 'cubic','quintic', 'thin_plate']对于 interp_types 中的种类:interp = scipy.interpolate.Rbf(x, y, z, 函数=种类)zi = interp(xi, yi)图, ax = plt.subplots()ax.plot(x, y, 'ko')im = ax.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')fig.colorbar(im)ax.set(标题=种类)fig.savefig(kind + '.png', dpi=80)plt.show()

I was inspired by this answer by @James to see how griddata and map_coordinates might be used. In the examples below I'm showing 2D data, but my interest is in 3D. I noticed that griddata only provides splines for 1D and 2D, and is limited to linear interpolation for 3D and higher (probably for very good reasons). However, map_coordinates seems to be fine with 3D using higher order (smoother than piece-wise linear) interpolation.

My primary question: if I have random, unstructured data (where I can not use map_coordinates) in 3D, is there some way to get smoother than piece-wise linear interpolation within the NumPy SciPy universe, or at least nearby?

My secondary question: is spline for 3D not available in griddata because it is difficult or tedious to implement, or is there a fundamental difficulty?

The images and horrible python below show my current understanding of how griddata and map_coordinates can or can't be used. Interpolation is done along the thick black line.

STRUCTURED DATA:

UNSTRUCTURED DATA:

Horrible python:

import numpy as np
import matplotlib.pyplot as plt

def g(x, y):
    return np.exp(-((x-1.0)**2 + (y-1.0)**2))

def findit(x, X):  # or could use some 1D interpolation
    fraction = (x - X[0]) / (X[-1]-X[0])
    return fraction * float(X.shape[0]-1)

nth, nr = 12, 11
theta_min, theta_max = 0.2, 1.3
r_min,     r_max     = 0.7, 2.0

theta = np.linspace(theta_min, theta_max, nth)
r     = np.linspace(r_min,     r_max,     nr)

R, TH = np.meshgrid(r, theta)
Xp, Yp  = R*np.cos(TH), R*np.sin(TH)
array = g(Xp, Yp)

x, y = np.linspace(0.0, 2.0, 200), np.linspace(0.0, 2.0, 200)
X, Y = np.meshgrid(x, y)
blob = g(X, Y)

xtest = np.linspace(0.25, 1.75, 40)
ytest = np.zeros_like(xtest) + 0.75

rtest = np.sqrt(xtest**2 + ytest**2)
thetatest = np.arctan2(xtest, ytest)

ir = findit(rtest, r)
it = findit(thetatest, theta)

plt.figure()

plt.subplot(2,1,1)

plt.scatter(100.0*Xp.flatten(), 100.0*Yp.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.hold
plt.imshow(blob, origin='lower', cmap='gray')

plt.text(5, 5, "don't use jet!", color='white')


exact = g(xtest, ytest)

import scipy.ndimage.interpolation as spndint
ndint0 = spndint.map_coordinates(array, [it, ir], order=0)
ndint1 = spndint.map_coordinates(array, [it, ir], order=1)
ndint2 = spndint.map_coordinates(array, [it, ir], order=2)

import scipy.interpolate as spint
points = np.vstack((Xp.flatten(), Yp.flatten())).T   # could use np.array(zip(...))
grid_x = xtest
grid_y = np.array([0.75])

g0 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='nearest')
g1 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='linear')
g2 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='cubic')


plt.subplot(4,2,5)

plt.plot(exact, 'or')
#plt.plot(ndint0)
plt.plot(ndint1)
plt.plot(ndint2)
plt.title("map_coordinates")

plt.subplot(4,2,6)

plt.plot(exact, 'or')
#plt.plot(g0)
plt.plot(g1)
plt.plot(g2)
plt.title("griddata")

plt.subplot(4,2,7)

#plt.plot(ndint0 - exact)
plt.plot(ndint1 - exact)
plt.plot(ndint2 - exact)
plt.title("error map_coordinates")

plt.subplot(4,2,8)

#plt.plot(g0 - exact)
plt.plot(g1 - exact)
plt.plot(g2 - exact)
plt.title("error griddata")

plt.show()


seed_points_rand = 2.0 * np.random.random((400, 2))
rr = np.sqrt((seed_points_rand**2).sum(axis=-1))
thth = np.arctan2(seed_points_rand[...,1], seed_points_rand[...,0])
isinside = (rr>r_min) * (rr<r_max) * (thth>theta_min) * (thth<theta_max)
points_rand = seed_points_rand[isinside]

Xprand, Yprand = points_rand.T  # unpack
array_rand = g(Xprand, Yprand)

grid_x = xtest
grid_y = np.array([0.75])

plt.figure()

plt.subplot(2,1,1)

plt.scatter(100.0*Xprand.flatten(), 100.0*Yprand.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.hold
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')


g0rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='nearest')
g1rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='linear')
g2rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='cubic')

plt.subplot(4,2,6)

plt.plot(exact, 'or')
#plt.plot(g0rand)
plt.plot(g1rand)
plt.plot(g2rand)
plt.title("griddata")


plt.subplot(4,2,8)

#plt.plot(g0rand - exact)
plt.plot(g1rand - exact)
plt.plot(g2rand - exact)
plt.title("error griddata")

plt.show()

解决方案

Good question! (and nice plots!)

For unstructured data, you'll want to switch back to functions meant for unstructured data. griddata is one option, but uses triangulation with linear interpolation in between. This leads to "hard" edges at triangle boundaries.

Splines are radial basis functions. In scipy terms, you want scipy.interpolate.Rbf. I'd recommend using function="linear" or function="thin_plate" over cubic splines, but cubic is available as well. (Cubic splines will exacerbate problems with "overshooting" compared to linear or thin-plate splines.)

One caveat is that this particular implementation of radial basis functions will always use all points in your dataset. This is the most accurate and smooth approach, but it scales poorly as the number of input observation points increases. There are several ways around this, but things will get more complex. I'll leave that for another question.

At any rate, here's a simplified example. We'll generate random data and then interpolate it at points that are on a regular grid. (Note that the input is not on a regular grid, and the interpolated points don't need to be either.)

import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)

x, y, z = np.random.random((3, 10))

interp = scipy.interpolate.Rbf(x, y, z, function='thin_plate')

yi, xi = np.mgrid[0:1:100j, 0:1:100j]

zi = interp(xi, yi)

plt.plot(x, y, 'ko')
plt.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
plt.colorbar()

plt.show()


Choice of spline type

I chose "thin_plate" as the type of spline. Our input observations points range from 0 to 1 (they're created by np.random.random). Notice that our interpolated values go slightly above 1 and well below zero. This is "overshooting".


Linear splines will completely avoid overshooting, but you'll wind up with "bullseye" patterns (nowhere near as severe as with IDW methods, though). For example, here's the exact same data interpolated with a linear radial basis function. Notice that our interpolated values never go above 1 or below 0:


Higher order splines will make trends in the data more continuous but will overshoot more. The default "multiquadric" is fairly similar to a thin-plate spline, but will make things a bit more continuous and overshoot a bit worse:


However, as you go to even higher order splines such as "cubic" (third order):

and "quintic" (fifth order)

You can really wind up with unreasonable results as soon as you move even slightly beyond your input data.


At any rate, here's a simple example to compare different radial basis functions on random data:

import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)

x, y, z = np.random.random((3, 10))
yi, xi = np.mgrid[0:1:100j, 0:1:100j]

interp_types = ['multiquadric', 'inverse', 'gaussian', 'linear', 'cubic',
                'quintic', 'thin_plate']

for kind in interp_types:
    interp = scipy.interpolate.Rbf(x, y, z, function=kind)
    zi = interp(xi, yi)

    fig, ax = plt.subplots()
    ax.plot(x, y, 'ko')
    im = ax.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
    fig.colorbar(im)
    ax.set(title=kind)
    fig.savefig(kind + '.png', dpi=80)

plt.show()

这篇关于如果我想要随机非结构化数据的 3D 样条/平滑插值怎么办?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆