`nls` fails to estimate parameters of my model
Problem description
I am trying to estimate the constants for Heaps' law.
I have the following dataset, novels_collection:
  Number of novels DistinctWords WordOccurrences
1                1         13575          117795
2                1         34224          947652
3                1         40353         1146953
4                1         55392         1661664
5                1         60656         1968274
Then I define the following function:
# Function for Heaps' law
heaps <- function(K, n, B) {
  K * n^B
}
heaps(2, 117795, .7)  # Just to test that it works
Here n is WordOccurrences, and K and B are the constants to be estimated so that the function predicts DistinctWords.
I tried this, but it gives me an error:
fitHeaps <- nls(DistinctWords ~ heaps(K, WordOccurrences, B),
                data = novels_collection[, 2:3],
                start = list(K = .1, B = .1), trace = TRUE)
Error in numericalDeriv(form[[3L]], names(ind), env) :
  Missing value or an infinity produced when evaluating the model
Any idea how I could fix this, or another method to fit the function and get the values of K and B?
Recommended answer
If you take the log of both sides of y = K * n ^ B, you get log(y) = log(K) + B * log(n). This is a linear relationship between log(y) and log(n), so you can fit a linear regression model to estimate log(K) (the intercept) and B (the slope).
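# Take logs of the two columns of novels_collection
logy <- log(novels_collection$DistinctWords)
logn <- log(novels_collection$WordOccurrences)
fit <- lm(logy ~ logn)       ## intercept = log(K), slope = B
para <- coef(fit)            ## log(K) and B
para[1] <- exp(para[1])      ## K and B
names(para) <- c("K", "B")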
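If a fit on the original scale is still wanted, the log-log estimates can serve as starting values for nls in place of the arbitrary K = .1, B = .1. The following is only a sketch under assumptions: it rebuilds the five rows from the question by hand, the column name NumberOfNovels and the object names fit_lm and start_vals are made up for illustration, and convergence of nls is likely with starting values this close but not guaranteed.

# Data from the question; the column name NumberOfNovels is an assumption
novels_collection <- data.frame(
  NumberOfNovels  = c(1, 1, 1, 1, 1),
  DistinctWords   = c(13575, 34224, 40353, 55392, 60656),
  WordOccurrences = c(117795, 947652, 1146953, 1661664, 1968274)
)

# Heaps' law: predicted distinct words for n word occurrences
heaps <- function(K, n, B) K * n^B

# Log-log linear fit: intercept estimates log(K), slope estimates B
fit_lm <- lm(log(DistinctWords) ~ log(WordOccurrences),
             data = novels_collection)
start_vals <- list(K = exp(unname(coef(fit_lm)[1])),
                   B = unname(coef(fit_lm)[2]))

# Refit on the original scale, starting from the log-log estimates
fitHeaps <- nls(DistinctWords ~ heaps(K, WordOccurrences, B),
                data = novels_collection, start = start_vals)
coef(fitHeaps)

The two fits will generally not give identical values, because lm minimizes squared error in log(DistinctWords) while nls minimizes squared error in DistinctWords itself.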