How to interpolate points between two irregular sets of data?

Question

I'm sorry for the somewhat confusing title, but I wasn't sure how to sum this up any clearer.

I have two sets of X,Y data, each set corresponding to a general overall value. They are fairly densely sampled from the raw data. What I'm looking for is a way to find an interpolated X for any given Y for a value in between the sets I already have.

The graph makes this more clear:

In this case, the red line is from a set corresponding to 100, the yellow line is from a set corresponding to 50.

I want to be able to say, assuming these sets correspond to a gradient of values (even though they are clearly made up of discrete X,Y measurements), how do I find, say, where the X would be if the Y was 500 for a set that corresponded to a value of 75?

In the example here I would expect my desired point to be somewhere around here:

I do not need this function to be overly fancy — it can be simple linear interpolation of data points. I'm just having trouble thinking it through.

Note that neither the Xs nor the Ys of the two sets overlap perfectly. However it is rather trivial to say, "where are the nearest X points these sets share," or "where are the nearest Y points these sets share."

I have used simple interpolation between known values (e.g. find the X for the corresponding Y in sets "50" and "100", then average those to get "75"), and I end up with something that looks like this:

So clearly I am doing something wrong here. Obviously in this case X is (correctly) returning as 0 for all of those cases where the Y is higher than the maximum Y of the "lowest" set. Things start out great but somewhere around when one starts to approach the maximum Y for the lowest set it starts going haywire.

It's easy to see why mine is going wrong. Here's another way to look at the problem:

In the "correct" version, X ought to be about 250. Instead, what I'm doing is essentially averaging 400 and 0 so X is 200. How do I solve for X in such a situation? I was thinking that bilinear interpolation might hold the answer but nothing I've been able to find on that has made it clear how I'd go about this sort of thing, because they all seem to be structured for somewhat different problems.
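To make the failure concrete, here is a minimal JavaScript sketch of the naive approach described above. The function names (`xAtY`, `naiveX`) and the two toy curves are invented for illustration; they are not my real code or data, and the numbers are chosen only to reproduce the 400-and-0 case.

```javascript
// Linear interpolation of X at a given Y along one curve.
// A curve is an array of {x, y} points in order along the line.
// Returns 0 when y lies outside the curve's Y range, which is the
// behaviour that causes the trouble described above.
function xAtY(curve, y) {
  for (let i = 0; i < curve.length - 1; i++) {
    const a = curve[i];
    const b = curve[i + 1];
    const lo = Math.min(a.y, b.y);
    const hi = Math.max(a.y, b.y);
    if (a.y !== b.y && y >= lo && y <= hi) {
      const t = (y - a.y) / (b.y - a.y);
      return a.x + t * (b.x - a.x);
    }
  }
  return 0;
}

// Naive "75" estimate: average the X found on each known curve.
function naiveX(curveA, curveB, y) {
  return (xAtY(curveA, y) + xAtY(curveB, y)) / 2;
}

// Hypothetical toy data: the larger curve reaches Y = 800,
// the smaller one only reaches Y = 400.
const largerCurve = [{ x: 0, y: 800 }, { x: 400, y: 500 }, { x: 600, y: 0 }];
const smallerCurve = [{ x: 0, y: 400 }, { x: 300, y: 0 }];

// At Y = 500 the smaller curve contributes 0, dragging the average down.
console.log(naiveX(largerCurve, smallerCurve, 500)); // 200
```

The average of 400 and 0 is 200, even though nothing about the smaller curve suggests the in-between curve should sit there.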

Thank you for your help. Note that while I have obviously graphed the above data in R to make it easy to see what I'm talking about, the final work for this is in Javascript and PHP. I'm not looking for something heavy duty; simple is better.

Recommended answer

Good lord, I finally figured it out. Here's the end result:

Beautiful! But what a lot of work it was.

My code is too cobbled and too specific to my project to be of much use to anyone else. But here's the underlying logic.

You have to have two sets of data to interpolate from. I am calling these the "outer" curve and the "inner" curve. The "outer" curve is assumed to completely encompass, and not intersect with, the "inner" curve. The curves are really just sets of X,Y data, and correspond to a set of values defined as Z. In the example used here, the "outer" curve corresponds to Z = 50 and the "inner" curve corresponds to Z = 100.

The goal, just to reiterate, is to find X for any given Y where Z is some number in between our known points of data.

  1. Start by figuring out how far between the two curve sets the unknown Z lies, as a proportion. So if Z = 75 in our example, that works out to 0.5. If Z = 60 that would be 0.2. If Z = 90 then that would be 0.8. Call this proportion P.

  2. Select the data point on the "outer" curve where Y = your desired Y. Imagine a line segment between that point and 0,0. Define that as AB.

  3. We want to find where AB intersects with the "inner" curve. To do this, we iterate through each point on the inner curve. Define the line segment between the chosen point and the next point as CD. Check if AB and CD intersect. If not, continue iterating until they do.

  4. When we find an AB-CD intersection, we now look at the line segment between that intersection and our original point on the "outer" curve from step 2. This segment runs between the inner and outer curves, and the line through it, were it continued "down" the chart, would pass through 0,0. Define this new line segment as EF.

  5. Find the position at proportion P (from step 1) along the length of EF, measured from the outer-curve end, since P = 0 corresponds to the outer curve. Check the Y value. Is it our desired Y value? If it is (unlikely), return the X of that point. If not, see if Y is less than the goal Y. If it is, store the position of that point in a variable, which I'll dub lowY. Then go back to step 2 again for the next point on the outer curve. If it is greater than the goal Y, see if lowY has a value in it. If it does, interpolate between the two values and return the interpolated X. (We have "boxed in" our desired coordinate, in other words.)
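The steps above can be sketched roughly as follows in JavaScript. This is not my project code, just an illustration of the logic: the names `segmentIntersection` and `radialInterpolateX`, the `{x, y}` point format, and the toy data in the usage note are all assumptions. It also makes one simplification: rather than starting at the outer point nearest the desired Y, it walks every outer point in order and brackets the target Y between consecutive interpolated points, from either side.

```javascript
// Intersection point of segments p1-p2 and p3-p4, or null if they do not cross.
function segmentIntersection(p1, p2, p3, p4) {
  const d1x = p2.x - p1.x, d1y = p2.y - p1.y;
  const d2x = p4.x - p3.x, d2y = p4.y - p3.y;
  const denom = d1x * d2y - d1y * d2x;
  if (denom === 0) return null; // parallel or degenerate segments
  const t = ((p3.x - p1.x) * d2y - (p3.y - p1.y) * d2x) / denom;
  const u = ((p3.x - p1.x) * d1y - (p3.y - p1.y) * d1x) / denom;
  if (t < 0 || t > 1 || u < 0 || u > 1) return null;
  return { x: p1.x + t * d1x, y: p1.y + t * d1y };
}

// Estimate X at targetY on the curve for value z, lying between the
// outer curve (value zOuter) and the inner curve (value zInner).
// Returns null if targetY is never bracketed.
function radialInterpolateX(outer, inner, zOuter, zInner, z, targetY) {
  const p = (z - zOuter) / (zInner - zOuter); // step 1: proportion P
  const origin = { x: 0, y: 0 };
  let prev = null; // previous interpolated point, used for bracketing

  for (const a of outer) { // step 2: AB runs from 0,0 to outer point a
    // Step 3: find the inner segment CD that AB crosses.
    let hit = null;
    for (let i = 0; i < inner.length - 1 && !hit; i++) {
      hit = segmentIntersection(origin, a, inner[i], inner[i + 1]);
    }
    if (!hit) continue;

    // Steps 4-5: point at proportion P along EF, from the outer end.
    const q = { x: a.x + p * (hit.x - a.x), y: a.y + p * (hit.y - a.y) };
    if (q.y === targetY) return q.x;

    // Target Y lies strictly between the previous point and this one.
    if (prev && (prev.y - targetY) * (q.y - targetY) < 0) {
      const t = (targetY - prev.y) / (q.y - prev.y);
      return prev.x + t * (q.x - prev.x);
    }
    prev = q;
  }
  return null;
}
```

As a sanity check on made-up straight-line data, take an outer curve `[{x: 0, y: 800}, {x: 400, y: 400}, {x: 800, y: 0}]` at Z = 50 and an inner curve `[{x: 0, y: 400}, {x: 200, y: 200}, {x: 400, y: 0}]` at Z = 100. Calling `radialInterpolateX(outer, inner, 50, 100, 75, 450)` returns 150, which sits on the line X + Y = 600 halfway between the two.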

The above procedure works pretty well. It fails in the case of Y = 0, but that case is easy to handle since you can just interpolate directly between those two specific points. In places where there are far fewer samples it produces somewhat jagged results, but I guess that's to be expected. (In my test plot the curves were Z = 5000, 6000, 7000, 8000, 9000, 10000, where only 5000 and 10000 are known and have only 20 data points each; the rest are interpolated.)
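For the Y = 0 case, a plain linear interpolation between the two curves' X values at Y = 0 is enough. Here is a minimal JavaScript sketch; the function name `xAtZeroY` and its parameters are hypothetical, and it assumes you already know where each curve meets Y = 0.

```javascript
// X at Y = 0 for value z, given the X where each known curve meets Y = 0.
// outerX0 belongs to the curve with value zOuter, innerX0 to zInner.
function xAtZeroY(outerX0, innerX0, zOuter, zInner, z) {
  const p = (z - zOuter) / (zInner - zOuter); // same proportion P as step 1
  return outerX0 + p * (innerX0 - outerX0);
}

// Made-up example: outer curve meets Y = 0 at X = 800, inner at X = 400.
console.log(xAtZeroY(800, 400, 50, 100, 75)); // 600
```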

I am under no pretensions that this is an optimized solution, but solving for gobs of points is practically instantaneous on my computer so I assume it is not too taxing for a modern machine, at least with the number of total points I have (30-50 per curve).

Thanks for everyone's help; it helped a lot to talk this through a bit and realize that what I was really going for here was not any simple linear interpolation but a kind of "radial" interpolation along the curve.
