将曲线拟合到数据,获得分析形式,然后检查曲线何时超过阈值 [英] Fit curve to data, get analytical form, & check when curve crosses threshold
问题描述
每条曲线有40个点,我想使函数平滑并估计曲线何时在y轴上超过阈值。
是否有适合我使用的拟合函数,可以使用插值法绘制新函数,但无法弄清楚如何请求y =阈值的x值。
不幸的是,曲线的形状并不相同,所以我不能使用scipy.optimize.curve_fit。
谢谢!
更新:添加两条曲线:
曲线1
[942.153,353.081,53.088,125.110 ,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662 ,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688]
曲线2
[-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047 ,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938 .586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523 ,-5878.675,-5193.578,-7193.980]
x值是
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21 ,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]
要拟合平滑曲线,可以拟合
曲线拟合得很好。
< h2>何时超过阈值?
要检查如果每个系列中都超过了阈值
,则可以执行以下操作:
用于x,y在zip(fitted_vals_curve1 [0],fit_vals_curve1 [1])中:
如果y>阈值:
xcross_curve1 = x
打破
表示zip中的x,y(fitted_vals_curve2 [0],fit_vals_curve2 [1]):
如果y>阈值:
xcross_curve2 = x
休息
xcross_curve1
和 xcross_curve2
将保存 x
的值,其中 curve1
和 curve2
越过阈值
,如果它们确实越过了阈值
;
让我们对它们进行绘图以检查其是否有效(
(以及文本输出: curve2未传递g阈值
。)
如果您想提高 xcross_curve1
或<$ c $的准确性c> xcross_curve2 ,您可以增加上面定义的学位
和 n
。
从Legendre到多项式形式
我们拟合了一条曲线,其大致形式为:
其中, P_n
是 n
Legendre多项式, s(x)
是一些将 x
转换为 P_n
期望(一些我们现在不需要知道的数学知识)。
我们希望我们的拟合线形式为:
我们将使用 legendre()
的
legendredict [4]
是:
isarray([0.00000000 e + 00、0.00000000e + 00、0.00000000e + 00、0.00000000e + 00,
0.00000000e + 00、0.00000000e + 00、3.29634565e + 05、3.65967884e-11,
-2.82543913e +05,1.82983942e-11,2.82543913e + 04])
在 sum P_n
s( f(x)
,上面),我们有 q_4
的 P_4
,等于具有 2.82543913e + 04
> 1 s, 1.82983942e-11
的 x
, -2.82543913e + 05
的<$ c c $ c> x ^ 2 等,仅来自 P_4
组件。
所以,如果我们想知道多少 1
s, x
s, x ^ 2
s,等等,我们需要形成多项式和,我们需要添加 1
s, x
s, x ^ 2
s等来自所有不同 P_n
s的等。这就是我们要做的:
polycoeffs = np.sum(np.stack(list(legendredict.values())),axis = 0)
然后让我们形成一个多项式总和:
对于icoef,枚举系数(reversed(polycoeffs)):
print(str(coef)+'* s(x)**'+ str(icoef),end ='\n +')
提供输出:
- 874.1456709637822 * s(x)** 0
+ 2893.7228005540596 * s(x)** 1
+ 50415.38472217957 * s(x)** 2
+ -6979.322584205707 * s(x)* * 3
+ -453363.49985790614 * s(x)** 4
+ -250464.7549807652 * s(x)** 5
+ 1250129.5521521813 * s(x)** 6
+ 1267709.5031024509 * s(x)** 7
+ -493280.0177807359 * s(x)** 8
+ -795684.224334346 * s(x)** 9
+ -134370.1696946264 * s( x)** 10
+
(我们将忽略最后一个 +
符号,此处的格式不是重点。)
我们需要计算 s(x)
也一样如果我们使用Jupyter Notebook /
从这里我们可以清楚地看到 s(x)
是 -1.0512820512820513 + 0.05128205128205128x
。如果我们想以更编程的方式进行操作:
2 /(legendrefit_curve1.domain [1] -legendrefit_curve1.domain [0])
是 0.05128205128205128
&
-1-2 /(legendrefit_curve1.domain [1] -legendrefit_curve1.domain [0])
只是 -1.0512820512820513
出于某些数学原因,这是正确的,此处的相关性不高(
所以让我们用 s(x)<
替换为显式函数:
对于icoef,枚举中的coef(reversed(polycoeffs)):
print(str(coef)+'*(-1.0512820512512820513 + 0512820512820513 * x)**'+ str(icoef),end ='\n +')
提供输出:
-874.1456709637822 *(-1.0512820512820513 + 0512820512820513 * x)** 0
+2893.7228005540596 *(-1.0512820512820513 + 0512820512820513 * x)** 1
+50415.38472217957 *(-1.0512820512820513 + 0512820512820513 * x)** 2
+ -6979.322584205707 *(-1.0512820512820820 + 0512820512820513 * x)** 3
+ -453363.49985790614 *(-1.0512820512820513 + 0512820512820513 * x)** 4
+ -250464.7549807652 *(-1.0512820512820820513 + 0512820512820513 * x)** 5
+1250129.5521521813 *(-1.0512820512820513 + 0512820512820513 * x)** 6
+1267709.5031024509 *(-1.0512820512820513 + 0512820512820513 * x)** 7
+ -493280.0177807359 *(-1.0512820512820513 + 0512820512820513 * x)** 8
+ -795684.224334346 * (-1.0512820512820513 + 0512820512820513 * x)** 9
+ -134370.1696946264 *(-1.0512820512820513 + 0512820512820513 * x)** 10
+
可以根据需要进行简化。 (忽略最后一个 +
符号。)
如果想要更高(更低)的学位多项式拟合,刚好适应勒让德多项式的较高(较低)度。
I have 40 points for each curve and I would like to smooth the function and estimate when the curve crosses a threshold on the y axis. Is there a fitting function that I can easily apply this to, I can use interpolate to plot the new function but I can't figure out how to request the x value for which y = threshold.
Unfortunately the curves don't all have the same shape so I can't use scipy.optimize.curve_fit.
Thanks!
Update: Adding two curves:
Curve 1
[942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688]
Curve 2
[-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980]
x values are
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]
For fitting a smooth curve, you can fit Legendre polynomials using numpy.polynomial.legendre.Legendre's fit method.
# import packages we need later
import matplotlib.pyplot as plt
import numpy as np
Fitting Legendre Polynomials
Preparing data as numpy arrays:
curve1 = \
np.asarray([942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688])
curve2 = \
np.asarray([-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980])
xvals = \
np.asarray([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40])
Lets fit the Legendre polynomials, degree
being the highest degree polynomial used, the first few is here for example.
degree=10
legendrefit_curve1 = np.polynomial.legendre.Legendre.fit(xvals, curve1, deg=degree)
legendrefit_curve2 = np.polynomial.legendre.Legendre.fit(xvals, curve2, deg=degree)
Calculate these fitted curves at evenly spaced points using the linspace method. n
is the number of point pairs we want to have.
n=100
fitted_vals_curve1 = legendrefit_curve1.linspace(n=n)
fitted_vals_curve2 = legendrefit_curve2.linspace(n=n)
Let's plot the result, along with a threshold
(using axvline):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
threshold=100000
plt.axhline(y=threshold)
The curves fit beautifully.
When is the threshold crossed?
To check where the threshold
is crossed in each series, you can do:
for x, y in zip(fitted_vals_curve1[0], fitted_vals_curve1[1]):
if y > threshold:
xcross_curve1 = x
break
for x, y in zip(fitted_vals_curve2[0], fitted_vals_curve2[1]):
if y > threshold:
xcross_curve2 = x
break
xcross_curve1
and xcross_curve2
will hold the x
value where curve1
and curve2
crossed the threshold
if they did cross the threshold
; if they did not, they will be undefined.
Let's plot them to check if it works (link to axhline docs):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
plt.axhline(y=threshold)
try: plt.axvline(x=xcross_curve1)
except NameError: print('curve1 is not passing the threshold',c='b')
try: plt.axvline(x=xcross_curve2)
except NameError: print('curve2 is not passing the threshold')
As expected, we get this plot:
(and a text output: curve2 is not passing the threshold
.)
If you would like to increase accuracy of xcross_curve1
or xcross_curve2
, you can increase degree
and n
defined above.
From Legendre to Polynomial form
We have fitted a curve, which roughly has the form:
where P_n
is the n
th Legendre polynomial, s(x)
is some function which transforms x
to the range P_n
expects (some math stuff which we don't need to know now).
We want our fitted line in the form:
We'll use legendre()
of scipy.special:
from scipy.special import legendre
We'll also use use np.pad
(docs, good SO post).
legendredict={}
for icoef, coef in enumerate(legendrefit_curve1.coef):
legendredict[icoef]=coef*np.pad(legendre(icoef).coef,(10-icoef,0),mode='constant')
legendredict
will hold keys
from 0
to 10
, and each value in the dict
will be a list of float
s. The key
is refering to the degree of the Polynomial, and the list of float
s are expressing what are the coefficients of x**n
values within that constituent polynomial of our fit, in a backwards order.
For example:
P_4
is:
legendredict[4]
is:
isarray([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 3.29634565e+05, 3.65967884e-11,
-2.82543913e+05, 1.82983942e-11, 2.82543913e+04])
Meaning that in the sum of P_n
s (f(x)
, above), we have q_4
lot of P_4
, which is equivalent to having 2.82543913e+04
of 1
s, 1.82983942e-11
of x
, -2.82543913e+05
of x^2
, etc, only from the P_4
component.
So if we want to know how much 1
s, x
s, x^2
s, etc we need to form the polynomial sum, we need to add the need for 1
s, x
s, x^2
s, etcs from all the different P_n
s. This is what we do:
polycoeffs = np.sum(np.stack(list(legendredict.values())),axis=0)
Then let's form a polynomial sum:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*s(x)**'+str(icoef),end='\n +')
Giving the output:
-874.1456709637822*s(x)**0
+2893.7228005540596*s(x)**1
+50415.38472217957*s(x)**2
+-6979.322584205707*s(x)**3
+-453363.49985790614*s(x)**4
+-250464.7549807652*s(x)**5
+1250129.5521521813*s(x)**6
+1267709.5031024509*s(x)**7
+-493280.0177807359*s(x)**8
+-795684.224334346*s(x)**9
+-134370.1696946264*s(x)**10
+
(We are going to ignore the last +
sign, formatting is not the main point here.)
We need to calculate s(x)
as well. If we are working in a Jupyter Notebook / Google Colab, only executing a cell with legendrefit_curve1
returns:
From where we can clearly see that s(x)
is -1.0512820512820513+0.05128205128205128x
. If we want to do it in a more programmatic way:
2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
is 0.05128205128205128
&
-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
is just -1.0512820512820513
Which is true for some mathamatical reasons not much relevant here (related Q).
So we can define:
def s(input):
a=-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
b=2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
return a+b*input
Also, lets define, based on the above obtained sum of polynomials of s(x)
:
def polyval(x):
return -874.1456709637822*s(x)**0+2893.7228005540596*s(x)**1+50415.38472217957*s(x)**2+-6979.322584205707*s(x)**3+-453363.49985790614*s(x)**4+-250464.7549807652*s(x)**5+1250129.5521521813*s(x)**6+1267709.5031024509*s(x)**7+-493280.0177807359*s(x)**8+-795684.224334346*s(x)**9+-134370.1696946264*s(x)**10
In a more programmatic way:
def polyval(x):
return sum([coef*s(x)**icoef for icoef, coef in enumerate(reversed(polycoeffs))])
Check that our polynomial indeed fits:
plt.scatter(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve1[0],[polyval(val) for val in fitted_vals_curve1[0]])
It does:
So let's print out our pure polynomial sum, with s(x)
being replaced by an explicit function:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*(-1.0512820512820513+0512820512820513*x)**'+str(icoef),end='\n +')
Giving the output:
-874.1456709637822*(-1.0512820512820513+0512820512820513*x)**0
+2893.7228005540596*(-1.0512820512820513+0512820512820513*x)**1
+50415.38472217957*(-1.0512820512820513+0512820512820513*x)**2
+-6979.322584205707*(-1.0512820512820513+0512820512820513*x)**3
+-453363.49985790614*(-1.0512820512820513+0512820512820513*x)**4
+-250464.7549807652*(-1.0512820512820513+0512820512820513*x)**5
+1250129.5521521813*(-1.0512820512820513+0512820512820513*x)**6
+1267709.5031024509*(-1.0512820512820513+0512820512820513*x)**7
+-493280.0177807359*(-1.0512820512820513+0512820512820513*x)**8
+-795684.224334346*(-1.0512820512820513+0512820512820513*x)**9
+-134370.1696946264*(-1.0512820512820513+0512820512820513*x)**10
+
Which can be simplified, as desired. (Ignore the last +
sign.)
If want a higher (lower) degree polynomial fit, just fit higher (lower) degrees of Legendre polynomials.
这篇关于将曲线拟合到数据,获得分析形式,然后检查曲线何时超过阈值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!