When to choose nls() over loess()?


Question

If I have some (x,y) data, I can easily draw a straight line through it, e.g.

f=glm(y~x)
plot(x,y)
lines(x,f$fitted.values)

But for curvy data I want a curvy line. It seems loess() can be used:

f=loess(y~x)
plot(x,y)
lines(x,f$fitted)

This question has evolved as I've typed and researched it. I started off wanting a simple function to fit curvy data (where I know nothing about the data), and wanting to understand how to use nls() or optim() to do that. That was what everyone seemed to be suggesting in similar questions I found. But now that I have stumbled upon loess(), I'm happy. So now my question is: why would someone choose to use nls or optim instead of loess (or smooth.spline)? Using the toolbox analogy, is nls a screwdriver and loess a power screwdriver (meaning I'd almost always choose the latter, as it does the same thing with less effort on my part)? Or is nls a flat-head screwdriver and loess a cross-head screwdriver (meaning loess is a better fit for some problems, but for others it simply won't do the job)?

For reference, here is the play data I was using that loess gives satisfactory results for:

x=1:40
y=(sin(x/5)*3)+runif(x)

And:

x=1:40
y=exp(jitter(x,factor=30)^0.5)

Sadly, it does less well on this:

x=1:400
y=(sin(x/20)*3)+runif(x)

Can nls(), or any other function or library, cope with both this and the previous exp example, without being given a hint (i.e. without being told it is a sine wave)?

UPDATE: Some useful pages on the same theme on stackoverflow:

Goodness of fit functions in R

How to fit a smooth curve to my data in R?

smooth.spline "out of the box" gives good results on my 1st and 3rd examples, but terrible (it just joins the dots) on the 2nd example. However f=smooth.spline(x,y,spar=0.5) is good on all three.

UPDATE #2: gam() (from mgcv package) is great so far: it gives a similar result to loess() when that was better, and a similar result to smooth.spline() when that was better. And all without hints or extra parameters. The docs were so far over my head I felt like I was squinting at a plane flying overhead; but a bit of trial and error found:

library(mgcv) #gam() lives in the mgcv package
#f=gam(y~x)    #Works just like glm(). I.e. pointless
f=gam(y~s(x)) #This is what you want
plot(x,y)
lines(x,f$fitted.values)


Recommended answer

Nonlinear least squares is a means of fitting a model that is non-linear in the parameters. By fitting a model, I mean there is some a priori specified form for the relationship between the response and the covariates, with some unknown parameters that are to be estimated. As the model is non-linear in these parameters, NLS estimates values for those coefficients by minimising a least-squares criterion in an iterative fashion.
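As a minimal sketch of what "a priori specified form" means in practice, the following fits the question's third data set with nls(). The sine form, the parameter names a, b, c and the start values are assumptions made for illustration; nls() has to be told all of these, and a poor start value for the frequency b can prevent convergence.

```r
set.seed(1)                               # make the noise reproducible
x <- 1:400
y <- sin(x / 20) * 3 + runif(length(x))

# We supply the model form: amplitude a, frequency b, offset c (all assumed names).
# nls() only estimates the parameters, refining the start values iteratively.
fit <- nls(y ~ a * sin(b * x) + c,
           start = list(a = 2, b = 0.05, c = 0))
coef(fit)                                 # estimated a, b, c

plot(x, y)
lines(x, fitted(fit), col = "red")
```

The estimated parameters are directly interpretable (amplitude, frequency, offset), which is something a smoother cannot give you.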

LOESS was developed as a means of smoothing scatterplots. It has a much less well-defined concept of a "model" that is fitted (IIRC there is no "model"). LOESS works by trying to identify patterns in the relationship between response and covariates without the user having to specify what form that relationship takes. LOESS works out the relationship from the data themselves.
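Although no functional form is specified, loess() still has a tuning parameter: span, the fraction of the data used in each local fit (default 0.75). The sketch below, on the question's third data set, suggests why the default does less well there. The value span = 0.2 is only an illustrative choice, not a recommendation.

```r
set.seed(1)
x <- 1:400
y <- sin(x / 20) * 3 + runif(length(x))

f_default <- loess(y ~ x)                 # default span = 0.75: each local fit uses 75% of the points
f_local   <- loess(y ~ x, span = 0.2)     # narrower window (assumed value) follows the wiggles

plot(x, y)
lines(x, predict(f_default), col = "blue")
lines(x, predict(f_local),   col = "red")

# Spread of what each smooth leaves unexplained
sd_default <- sd(y - predict(f_default))
sd_local   <- sd(y - predict(f_local))
c(default = sd_default, local = sd_local)
```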

These are two fundamentally different ideas. If you know the data should follow a particular model then you should fit that model using NLS. You could always compare the two fits (NLS vs LOESS) to see if there is systematic variation from the presumed model, but that would show up in the NLS residuals.
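One way to look for systematic variation in the NLS residuals is to smooth them: a roughly flat smooth near zero suggests the presumed model leaves no obvious structure behind. This is only a sketch, again assuming the sine form and start values used for illustration; the names fit_nls, res and res_smooth are hypothetical.

```r
set.seed(1)
x <- 1:400
y <- sin(x / 20) * 3 + runif(length(x))

# Presumed model (assumed form and start values)
fit_nls <- nls(y ~ a * sin(b * x) + c,
               start = list(a = 2, b = 0.05, c = 0))
res <- residuals(fit_nls)

# Smooth the residuals with loess and plot them against x
res_smooth <- loess(res ~ x)
plot(x, res)
lines(x, predict(res_smooth), col = "red")
abline(h = 0, lty = 2)                    # reference line at zero
```

If the red line wandered far from the dashed line, that would hint that the presumed form is missing something.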

Instead of LOESS, you might consider Generalized Additive Models (GAMs) fitted via gam() in the recommended package mgcv. These models can be viewed as a penalised regression problem, but they allow the fitted smooth functions to be estimated from the data, as they are in LOESS. GAM extends GLM to allow smooth, arbitrary functions of covariates.
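A minimal sketch of such a fit, on the question's third data set, might look like this. It assumes mgcv is installed (as a recommended package it usually comes with R) and uses the default settings of s(), which may need adjusting for wigglier data.

```r
library(mgcv)                             # provides gam() and s()

set.seed(1)
x <- 1:400
y <- sin(x / 20) * 3 + runif(length(x))

# s(x) requests a smooth function of x; its wiggliness is controlled by a penalty
fit_gam <- gam(y ~ s(x))
edf <- summary(fit_gam)$edf               # effective degrees of freedom of the smooth
edf

plot(x, y)
lines(x, fitted(fit_gam), col = "red")
```

An effective degrees of freedom close to 1 would indicate an essentially straight-line fit; larger values indicate a wigglier smooth.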
