How to fit a data set to a specific function by trial and error, or by a better specific alternative, in R?


Problem description

I have a data set that I want to fit to the function y = (a*b*x) / (1 + b*x), and I want to find the parameters a and b:

I tried the nonlinear least squares approach. However, I'd like to try trial and error instead: take one vector of candidate values for a and another for b, plot every combination of these values, and choose the best fit.

library(ggplot2)

x <- c(52.67, 46.80, 41.74, 40.45)
y <- c(1.73, 1.84, 1.79, 1.45)

df <- data.frame(x,y)

ggplot(data = df, aes(x, y))+
  geom_point()+
  stat_smooth(method="nls",
              se=FALSE,
              formula = y ~ (a*b*x)/(1+(b*x)),
              method.args = list(start = c(a=2.86, b=0.032)))

Recommended answer

I wonder if you're a bit mistrustful of the output of nls, thinking that perhaps you could find a better fit yourself?

Here's a way to at least get a better feel for the fit produced by different values of a and b. The idea is to create a plot with the values of a on the x axis and the values of b on the y axis. For each pair of a and b, we work out how close the resulting curve is to the data (by taking the log of the sum of squared residuals). If the fit is good, we colour the tile with a light colour; if the fit is bad, we colour it with a darker colour. This lets us see which combinations give good fits: effectively a heat map of the parameters.

# Our actual data, put in a data frame:
df <- data.frame(x = c(52.67, 46.80, 41.74, 40.45), y = c(1.73, 1.84, 1.79, 1.45))

# Create a grid of all a and b values we want to compare
a <- seq(-5, 10, length.out = 200)
b <- seq(0, 0.5, length.out = 100)
all_mixtures <- setNames(expand.grid(a, b), c("a", "b"))

# Get the log of the sum of squared residuals for each (a, b) pair.
# i[1] is a and i[2] is b; note df$y rather than a global y.
all_mixtures$ss <- apply(all_mixtures, 1, function(i) {
  log(sum((i[1] * i[2] * df$x / (1 + i[2] * df$x) - df$y)^2))
})
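
The trial-and-error search the question asks for amounts to picking the grid row with the smallest error. A minimal sketch, assuming the same data and grid ranges as above; the helper `rss()` and the names `trial_grid` and `best` are made up for illustration and are not part of the original answer:

```r
# Same data as above
df <- data.frame(x = c(52.67, 46.80, 41.74, 40.45),
                 y = c(1.73, 1.84, 1.79, 1.45))

# Hypothetical helper: residual sum of squares of y = a*b*x / (1 + b*x)
rss <- function(a, b, data = df) {
  pred <- a * b * data$x / (1 + b * data$x)
  sum((pred - data$y)^2)
}

# Every combination of candidate a and b values
trial_grid <- expand.grid(a = seq(-5, 10, length.out = 200),
                          b = seq(0, 0.5, length.out = 100))
trial_grid$rss <- mapply(rss, trial_grid$a, trial_grid$b)

# The grid point with the smallest residual sum of squares
best <- trial_grid[which.min(trial_grid$rss), ]
best
```

The result is only as precise as the grid spacing, so it should land near, not exactly on, whatever a proper optimiser finds.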

Now we plot the heat map:

p <- ggplot(all_mixtures, aes(a, b, fill = ss)) +
  geom_tile() + 
  scale_fill_gradientn(colours = c("white", "yellow", "red", "blue")) 
p

Clearly, the optimum pair of a and b lies somewhere on the white line.
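
One way to see why the good fits form a line: for any fixed b the model is linear in a, so the least-squares a has a closed form, a = sum(g*y) / sum(g^2) with g = b*x / (1 + b*x). The sketch below traces that valley and reports the b with the smallest error along it. The names `profile_a` and `valley`, and the range of b values, are assumptions for illustration and are not part of the original answer:

```r
df <- data.frame(x = c(52.67, 46.80, 41.74, 40.45),
                 y = c(1.73, 1.84, 1.79, 1.45))

# Hypothetical helper: best a for a given b, plus the resulting error
profile_a <- function(b, data = df) {
  g <- b * data$x / (1 + b * data$x)   # the model is a * g, linear in a
  a <- sum(g * data$y) / sum(g^2)      # least squares through the origin
  c(a = a, b = b, rss = sum((a * g - data$y)^2))
}

# One row per candidate b: the best a and its residual sum of squares
b_values <- seq(0.001, 0.5, by = 0.001)
valley <- as.data.frame(t(sapply(b_values, profile_a)))

# Lowest point along the valley
valley[which.min(valley$rss), ]
```

This reduces the two-parameter trial-and-error search to a one-dimensional scan over b.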

Now let's see where nls thought the best combination of a and b was:

p + annotate("point", x = 2.8312323, y = 0.0334379, size = 5)
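
The coordinates plotted above are parameter estimates of the kind nls() returns. A sketch of fitting the model directly, assuming the starting values from the question; whether the result matches the quoted figures to every digit may depend on the R version and convergence settings:

```r
df <- data.frame(x = c(52.67, 46.80, 41.74, 40.45),
                 y = c(1.73, 1.84, 1.79, 1.45))

# Nonlinear least squares fit of y = a*b*x / (1 + b*x)
fit <- nls(y ~ (a * b * x) / (1 + b * x),
           data = df,
           start = list(a = 2.86, b = 0.032))

coef(fit)        # estimated a and b
deviance(fit)    # residual sum of squares at the estimate
```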

It looks as though it has found the optimum right at the "bend" of the white line, which is probably what you would have guessed.

If you stray outside this white line, your fit will be worse, and you're not going to find anywhere on the white line that's better.

Trust nls. Yes, the fit doesn't look very good, but that's simply because the data don't fit this particular formula very well, however you set its parameters. If your model has to be in this form, and these are your data, this is the best fit you are going to get.
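
To put a rough number on "doesn't look very good", one can compare the error at the nls estimate with the error of simply predicting the mean of y. This is a sketch only: the parameter values are the ones quoted above, the variable names are made up for illustration, and the ratio is an informal yardstick, not a formal goodness-of-fit test for a nonlinear model.

```r
df <- data.frame(x = c(52.67, 46.80, 41.74, 40.45),
                 y = c(1.73, 1.84, 1.79, 1.45))

# Parameter values quoted in the answer
a_hat <- 2.8312323
b_hat <- 0.0334379

# Fitted values of y = a*b*x / (1 + b*x) at those parameters
pred <- a_hat * b_hat * df$x / (1 + b_hat * df$x)

rss_model <- sum((df$y - pred)^2)          # error of the fitted curve
rss_mean  <- sum((df$y - mean(df$y))^2)    # error of a flat line at mean(y)

c(model = rss_model, mean_only = rss_mean, ratio = rss_model / rss_mean)
```

The closer the ratio is to 1, the less the curve improves on a flat line, which would point to the model form and the four data points as the limitation, not the optimiser.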
