如何找到并绘制R中多项式回归曲线的局部最大值? [英] How to find and plot the local maxima of a polynomial regression curve in R?

查看:57
本文介绍了如何找到并绘制R中多项式回归曲线的局部最大值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有农场的产量数据(自变量)和用作预测因子的各种营养素.我使用 lm(y ~ ploy(x,3)) 进行了单变量(三次)线性回归.然后我根据收益率绘制了预测变量 (P),并添加了最佳拟合曲线(图 1).然后我如何找到这条曲线的局部最大值,并在我的图中添加一个点,其中包括这个拟合收益率的值(图 2)?我查看了

图 2 - 我想创建的图:

我的代码:

头(农场)P <- 农场$P产量 <- 农场$产量拟合 <- lm(产量 ~ poly(P,3), data=farm)情节(P,产量,col =lightblue",xlab =P",ylab =产量",main =农场 - 多项式拟合")线(排序(P),拟合(拟合)[订单(P)],col='darkred',type='l',cex=10)

我的数据:

farm <- structure(list(Yield = c(28.3818, 27.0422555556, 31.5444454545,32.2084818182, 25.983, 43.49634, 53.39813333333, 59.19274, 61.23185,63.83512、64.5388、63.08576、63.83954、62.7838333333、64.10366、67.5600666667、67.6325、68.0023、63.02148、64.0025666667、64.19196、60.4893, 65.2803333333, 62.28096, 59.1914, 59.37304, 58.6482333333,58.08474、58.98306、61.52755、61.9972、62.0188833333、61.09072、58.44715、61.06646、59.6103833333、60.57035、67.5646、67.85488、71.5719571429、50.30092、22.80535、62.26542、62.0754333333、58.46464、62.7326833333、65.53482、63.3064666667、63.68004、63.5212166667、64.65068, 66.19655, 66.0726, 66.6404666667, 65.02208, 62.1035666667,61.78824、61.0844166667、60.1723、61.6899333333、57.8784166667、56.8886666667、59.13944、57.26695、69.5792666667、61.9865166667、55.7342、53.7012857143、56.9748166667、56.8706333333、61.3384166667、52.87725、29.2331888889、15.5046、42.943、44.590325、50.09525、52.68065, 53.0983714286, 19.43875, 38.06708, 59.9217666667, 62.0287166667,66.59496, 64.3986333333, 64.4089333333, 64.6951, 63.8205, 63.6122,62.51384、63.2565666667、62.47745、61.42234、62.6233166667、62.0730333333、60.81996、60.6490833333、58.4331333333、59.94638、61.4119333333), P = c(90L, 93L, 97L, 97L, 100L, 106L, 107L, 114L, 118L, 120L,121L、121L、120L、120L、121L、115L、116L、113L、101L、90L、85L、84L, 85L, 85L, 83L, 82L, 82L, 82L, 82L, 84L, 84L, 85L, 84L, 87L,87L, 88L, 88L, 92L, 93L, 95L, 80L, 67L, 85L, 90L, 88L, 90L, 91L,91L, 91L, 91L, 91L, 92L, 88L, 78L, 73L, 71L, 71L, 71L, 71L, 71L,69L、69L、69L、71L、75L、71L、71L、78L、82L、84L、84L、77L、64L、51L、99L、104L、107L、109L、107L、99L、102L、115L、120L、121L、121L、120L、121L、112L、111L、101L、101L、89L、89L、85L、84L、83L, 83L, 82L, 83L, 83L), Mg = c(666L, 667L, 668L, 668L, 668L,670L、670L、671L、672L、672L、673L、673L、673L、673L、673L、645L、645L、636L、594L、553L、535L、534L、534L、534L、534L、534L、534L、534L、534L、534L、534L、543L、540L、570L、568L、576L、576L、577L、577L、577L、574L、572L、575L、576L、576L、576L、577L、577L、577L、577L、577L、577L、574L、567L、565L、564L、564L、564L、564L、564L、564L、564L、564L、564L、565L、564L、553L、519L、509L、509L、509L、508L、505L、502L、668L、669L、670L、670L、670L、668L、669L、672L、672L、673L、673L、673L、673L、636L、636L、594L、594L、553L、553L、534L, 534L, 534L, 534L, 534L, 534L, 534L), S = c(29L, 30L, 31L,31L, 31L, 33L, 33L, 35L, 36L, 36L, 37L, 37L, 36L, 36L, 37L, 37L,37L, 37L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L,38L, 38L, 38L, 38L, 36L, 36L, 36L, 36L, 37L, 37L, 37L, 34L, 31L,35L, 36L, 36L, 36L, 37L, 36L, 37L, 37L, 37L, 37L, 37L, 38L, 39L,38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 39L, 38L, 37L, 37L,37L, 38L, 38L, 36L, 33L, 30L, 31L, 32L, 33L, 34L, 33L, 31L, 32L,35L, 36L, 37L, 37L, 36L, 37L, 37L, 37L, 38L, 38L, 38L, 38L, 38L,38L, 38L, 38L, 38L, 38L, 38L), K = c(537L, 542L, 549L, 549L,554L、563L、565L、576L、584L、586L、589L、588L、587L、587L、588L、565L、566L、557L、516L、477L、460L、459L、460L、459L、456L、455L、455L、454L、455L、457L、458L、463L、461L、478L、476L、482L、482L、489L、491L、493L、470L、447L、477L、485L、482L、485L、487L、486L、487L、487L、487L、488L、495L、512L、517L、514L、513L、513L、512L、512L、510L、509L、509L、513L、519L、513L、495L、454L、443L、446L、446L、435L、414L、392L、552L、561L、565L、569L、565L、551L、557L、579L、586L、589L、589L、587L、589L、555L、554L、515L、516L、476L、475L, 459L, 458L, 457L, 456L, 455L, 456L, 457L), Ca = c(4795L,4796L、4797L、4797L、4797L、4799L、4799L、4800L、4801L、4801L、4802L、4802L、4802L、4802L、4802L、4779L、4779L、4771L、4736L、4701L、4686L、4685L、4685L、4685L、4685L、4685L、4685L、4685L、4685L、4685L、4685L、4789L、4754L、5135L、5100L、5204L、5204L、5205L、5205L、5205L、5202L、5200L、5203L、5204L、5204L、5204L、5205L、5205L、5205L、5205L、5205L、5205L、5125L、4886L、4807L、4806L、4806L、4806L、4806L、4806L、4806L、4806L、4806L、4806L、4807L、4806L、4687L、4329L、4211L、4211L、4211L、4210L、4207L、4204L、4797L、4798L、4799L、4799L、4799L、4797L、4798L、4801L、4801L、4802L、4802L、4802L、4802L、4771L、4771L、4736L、4736L、4701L、4701L、4685L、4685L、4685L、4685L、4685L、4685L、4685L)), .Names = c("Yield", "P", "Mg", "S", "K", "Ca"), row.names = c(NA,100L), class = "data.frame")

解决方案

fit 的系数为您提供回归线的显式多项式方程.因此,您可以通过取该多项式的一阶和二阶导数来分析地找到最大值(或在有多个的情况下的最大值):

fit <- lm(Yield ~ poly(P,3, raw=TRUE), data=farm) # 注意使用raw=TRUE,否则poly返回正交多项式# 绘制数据点with(farm, plot(P, Yield, col=lightblue", ylim=c(0, max(Yield)),xlab = P",ylab = 产量",主要 =农场 - 多项式拟合"))# 添加模型拟合P = seq(min(farm$P),max(farm$P),长度=1000)pred = data.frame(P,产量=预测(拟合,新数据=数据.框架(P)))with(pred, lines(P, Yield, col='darkred', type='l', cex=10))# 模型系数向量cf = 系数(拟合)# 拟合的一阶导数.这只是为了说明;我们不会绘制这个# 方程直接,但我们会找到它的根来得到位置# 局部最大值和最小值.D1 = cf[2] + 2*cf[3]*pred$P + 3*cf[4]*pred$P^2# 一阶导数的根(这些是一阶导数 = 0 的位置).# 使用二次公式求根.D1roots = (-2*cf[3] + c(-1,1)*sqrt((2*cf[3])^2 - 4*3*cf[4]*cf[2]))/(2*3*cf[4])# 两个根的二阶导数.D2atD1roots = 2*cf[3] + 6*cf[4]*D1roots# 局部最大值是二阶导数为负的地方max_x = D1roots[which(D2atD1roots <0)]max_y = cf %*% max_x^(0:3)# 绘制局部最大值点(max_x,max_y,pch=16,col=红色")# 在局部最大值处添加产量值文本(max_x,max_y,标签=round(max_y,2),adj=c(0.5,-1),cex=0.8)

I have yield data for a farm (independent variable) and various nutrients which serve as predictors. I have performed univariate (cubic) linear regression using lm(y ~ ploy(x,3)). Then I plotted the predictor variable (P) against the yield, and added a best fit curve (Figure 1). How do I then find the local maxima of this curve, and add a point to my plot which includes the value of this fitted yield (Figure 2)? I have looked into the findPeaks() function from the quantmod package, but have not been able to implement.

Figure 1 - Plot I have created:

Figure 2 - Plot I would like to create:

My code:

head(farm)
P <- farm$P
Yield <- farm$Yield
fit <- lm(Yield ~ poly(P,3), data=farm)
plot(P, Yield, col="lightblue", xlab = "P", ylab = "Yield", main="Farm - polynomial fit")
lines(sort(P), fitted(fit)[order(P)], col='darkred', type='l', cex=10) 

My data:

farm <- structure(list(Yield = c(28.3818, 27.0422555556, 31.5444454545, 
32.2084818182, 25.983, 43.49634, 53.3981333333, 59.19274, 61.23185, 
63.83512, 64.5388, 63.08576, 63.83954, 62.7838333333, 64.10366, 
67.5600666667, 67.6325, 68.0023, 63.02148, 64.0025666667, 64.19196, 
60.4893, 65.2803333333, 62.28096, 59.1914, 59.37304, 58.6482333333, 
58.08474, 58.98306, 61.52755, 61.9972, 62.0188833333, 61.09072, 
58.44715, 61.06646, 59.6103833333, 60.57035, 67.5646, 67.85488, 
71.5719571429, 50.30092, 22.80535, 62.26542, 62.0754333333, 58.46464, 
62.7326833333, 65.53482, 63.3064666667, 63.68004, 63.5212166667, 
64.65068, 66.19655, 66.0726, 66.6404666667, 65.02208, 62.1035666667, 
61.78824, 61.0844166667, 60.1723, 61.6899333333, 57.8784166667, 
56.8886666667, 59.13944, 57.26695, 69.5792666667, 61.9865166667, 
55.7342, 53.7012857143, 56.9748166667, 56.8706333333, 61.3384166667, 
52.87725, 29.2331888889, 15.5046, 42.943, 44.590325, 50.09525, 
52.68065, 53.0983714286, 19.43875, 38.06708, 59.9217666667, 62.0287166667, 
66.59496, 64.3986333333, 64.4089333333, 64.6951, 63.8205, 63.6122, 
62.51384, 63.2565666667, 62.47745, 61.42234, 62.6233166667, 62.0730333333, 
60.81996, 60.6490833333, 58.4331333333, 59.94638, 61.4119333333
), P = c(90L, 93L, 97L, 97L, 100L, 106L, 107L, 114L, 118L, 120L, 
121L, 121L, 120L, 120L, 121L, 115L, 116L, 113L, 101L, 90L, 85L, 
84L, 85L, 85L, 83L, 82L, 82L, 82L, 82L, 84L, 84L, 85L, 84L, 87L, 
87L, 88L, 88L, 92L, 93L, 95L, 80L, 67L, 85L, 90L, 88L, 90L, 91L, 
91L, 91L, 91L, 91L, 92L, 88L, 78L, 73L, 71L, 71L, 71L, 71L, 71L, 
69L, 69L, 69L, 71L, 75L, 71L, 71L, 78L, 82L, 84L, 84L, 77L, 64L, 
51L, 99L, 104L, 107L, 109L, 107L, 99L, 102L, 115L, 120L, 121L, 
121L, 120L, 121L, 112L, 111L, 101L, 101L, 89L, 89L, 85L, 84L, 
83L, 83L, 82L, 83L, 83L), Mg = c(666L, 667L, 668L, 668L, 668L, 
670L, 670L, 671L, 672L, 672L, 673L, 673L, 673L, 673L, 673L, 645L, 
645L, 636L, 594L, 553L, 535L, 534L, 534L, 534L, 534L, 534L, 534L, 
534L, 534L, 534L, 534L, 543L, 540L, 570L, 568L, 576L, 576L, 577L, 
577L, 577L, 574L, 572L, 575L, 576L, 576L, 576L, 577L, 577L, 577L, 
577L, 577L, 577L, 574L, 567L, 565L, 564L, 564L, 564L, 564L, 564L, 
564L, 564L, 564L, 564L, 565L, 564L, 553L, 519L, 509L, 509L, 509L, 
508L, 505L, 502L, 668L, 669L, 670L, 670L, 670L, 668L, 669L, 672L, 
672L, 673L, 673L, 673L, 673L, 636L, 636L, 594L, 594L, 553L, 553L, 
534L, 534L, 534L, 534L, 534L, 534L, 534L), S = c(29L, 30L, 31L, 
31L, 31L, 33L, 33L, 35L, 36L, 36L, 37L, 37L, 36L, 36L, 37L, 37L, 
37L, 37L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 
38L, 38L, 38L, 38L, 36L, 36L, 36L, 36L, 37L, 37L, 37L, 34L, 31L, 
35L, 36L, 36L, 36L, 37L, 36L, 37L, 37L, 37L, 37L, 37L, 38L, 39L, 
38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 39L, 38L, 37L, 37L, 
37L, 38L, 38L, 36L, 33L, 30L, 31L, 32L, 33L, 34L, 33L, 31L, 32L, 
35L, 36L, 37L, 37L, 36L, 37L, 37L, 37L, 38L, 38L, 38L, 38L, 38L, 
38L, 38L, 38L, 38L, 38L, 38L), K = c(537L, 542L, 549L, 549L, 
554L, 563L, 565L, 576L, 584L, 586L, 589L, 588L, 587L, 587L, 588L, 
565L, 566L, 557L, 516L, 477L, 460L, 459L, 460L, 459L, 456L, 455L, 
455L, 454L, 455L, 457L, 458L, 463L, 461L, 478L, 476L, 482L, 482L, 
489L, 491L, 493L, 470L, 447L, 477L, 485L, 482L, 485L, 487L, 486L, 
487L, 487L, 487L, 488L, 495L, 512L, 517L, 514L, 513L, 513L, 512L, 
512L, 510L, 509L, 509L, 513L, 519L, 513L, 495L, 454L, 443L, 446L, 
446L, 435L, 414L, 392L, 552L, 561L, 565L, 569L, 565L, 551L, 557L, 
579L, 586L, 589L, 589L, 587L, 589L, 555L, 554L, 515L, 516L, 476L, 
475L, 459L, 458L, 457L, 456L, 455L, 456L, 457L), Ca = c(4795L, 
4796L, 4797L, 4797L, 4797L, 4799L, 4799L, 4800L, 4801L, 4801L, 
4802L, 4802L, 4802L, 4802L, 4802L, 4779L, 4779L, 4771L, 4736L, 
4701L, 4686L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 
4685L, 4685L, 4685L, 4789L, 4754L, 5135L, 5100L, 5204L, 5204L, 
5205L, 5205L, 5205L, 5202L, 5200L, 5203L, 5204L, 5204L, 5204L, 
5205L, 5205L, 5205L, 5205L, 5205L, 5205L, 5125L, 4886L, 4807L, 
4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 
4807L, 4806L, 4687L, 4329L, 4211L, 4211L, 4211L, 4210L, 4207L, 
4204L, 4797L, 4798L, 4799L, 4799L, 4799L, 4797L, 4798L, 4801L, 
4801L, 4802L, 4802L, 4802L, 4802L, 4771L, 4771L, 4736L, 4736L, 
4701L, 4701L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L
)), .Names = c("Yield", "P", "Mg", "S", "K", "Ca"), row.names = c(NA, 
100L), class = "data.frame")

解决方案

The coefficients of fit give you an explicit polynomial equation for the regression line. You can therefore find the maximum (or maxima in cases where there's more than one) analytically by taking the first and second derivatives of this polynomial:

fit <- lm(Yield ~ poly(P,3, raw=TRUE), data=farm)  # Note use of raw=TRUE, otherwise poly returns orthogonal polynomials

# Plot data points
with(farm, plot(P, Yield, col="lightblue", ylim=c(0, max(Yield)), 
                xlab = "P", ylab = "Yield", main="Farm - polynomial fit"))

# Add model fit
P = seq(min(farm$P), max(farm$P), length=1000)
pred = data.frame(P,
                  Yield=predict(fit, newdata=data.frame(P)))

with(pred, lines(P, Yield, col='darkred', type='l', cex=10))

# Vector of model coefficients
cf = coef(fit)

# First derivative of fit. This is just for Illustration; we won't plot this
#  equation directly, but we'll find its roots to get the locations of 
#  local maxima and minima.
D1 = cf[2] + 2*cf[3]*pred$P + 3*cf[4]*pred$P^2

# Roots of first derivative (these are locations where first derivative = 0).
#  Use quadratic formula to find roots.
D1roots = (-2*cf[3] + c(-1,1)*sqrt((2*cf[3])^2 - 4*3*cf[4]*cf[2]))/(2*3*cf[4])

# Second derivative at the two roots. 
D2atD1roots =  2*cf[3] + 6*cf[4]*D1roots

# Local maxima are where the second derivative is negative
max_x = D1roots[which(D2atD1roots < 0)]
max_y = cf %*% max_x^(0:3)

# Plot local maxima
points(max_x, max_y, pch=16, col="red")

# Add values of Yield at local maxima
text(max_x, max_y, label=round(max_y,2), adj=c(0.5,-1), cex=0.8)

这篇关于如何找到并绘制R中多项式回归曲线的局部最大值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆