Getting percentile values from gamlss centile curves
Question
This question is related to: Selecting Percentile curves using gamlss::lms in R
I can get a centile curve from the following data and code:
library(gamlss)
age = sample(5:15, 500, replace=TRUE)
yvar = rnorm(500, age, 20)
mydata = data.frame(age, yvar)
head(mydata)
age yvar
1 12 13.12974
2 14 -18.97290
3 10 42.11045
4 12 27.89088
5 11 48.03861
6 5 24.68591
h = lms(yvar, age , data=mydata, n.cyc=30)
centiles(h,xvar=mydata$age, cent=c(90), points=FALSE)
How can I now get the yvar value on the curve for each x value (5:15), i.e. the 90th percentile of the data after smoothing?
I read the help pages and found fitted(h) and fv(h), which return fitted values for the entire data set. But how do I get the value at each age level on the 90th centile curve? Thanks for your help.
Edit: the following figure shows what I need:
I tried the following, but it is not correct, since the values are wrong:
mydata$fitted = fitted(h)
aggregate(fitted~age, mydata, function(x) quantile(x,.9))
age fitted
1 5 6.459680
2 6 6.280579
3 7 6.290599
4 8 6.556999
5 9 7.048602
6 10 7.817276
7 11 8.931219
8 12 10.388048
9 13 12.138104
10 14 14.106250
11 15 16.125688
These values are very different from the 90th quantile computed directly from the data:
> aggregate(yvar~age, mydata, function(x) quantile(x,.9))
age yvar
1 5 39.22938
2 6 35.69294
3 7 25.40390
4 8 26.20388
5 9 29.07670
6 10 32.43151
7 11 24.96861
8 12 37.98292
9 13 28.28686
10 14 43.33678
11 15 44.46269
Recommended answer
See if this makes sense. The 90th percentile of a normal distribution with mean 'smn' and standard deviation 'ssd' is qnorm(.9, smn, ssd). So this seems to deliver (somewhat) sensible results, albeit without the full hack of the centiles function that I suggested:
plot(h$xvar, qnorm(.9, fitted(h), h$sigma.fv))
(Note the massive overplotting: there are only a few distinct xvar values but 500 points. You may also want to set ylim so that the full range can be appreciated.)
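Because the plot draws one point per observation, a table with one value per age may be easier to read. The following is a minimal base-R sketch of that idea. It uses made-up stand-ins for the fitted quantities: mu_hat and sigma_hat are hypothetical names and values, and with the fitted model they would be fitted(h) and h$sigma.fv as in the plot call above.

```r
# Hypothetical stand-ins for the per-observation fitted mean and sd;
# with the fitted model these would be fitted(h) and h$sigma.fv.
set.seed(1)
age       <- sample(5:15, 500, replace = TRUE)
mu_hat    <- 5 + 0.8 * age          # assumed smooth mean, depends only on age
sigma_hat <- rep(20, length(age))   # assumed constant sd (sigma formula ~1)

# 90th percentile of Normal(mu, sigma), evaluated for every observation
p90 <- qnorm(0.9, mean = mu_hat, sd = sigma_hat)

# Observations with the same age share the same value, so keep one row per age
p90_by_age <- unique(data.frame(age = age, p90 = p90))
p90_by_age <- p90_by_age[order(p90_by_age$age), ]
print(p90_by_age, row.names = FALSE)
```

With a constant sigma, each value is simply the fitted mean shifted upward by qnorm(0.9) * sigma, which is roughly 1.28 * sigma.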
The caveat here is that you need to check the other parts of the model to see whether it really is just an ordinary Normal model. In this case it seems to be:
> h$mu.formula
y ~ pb(x)
<environment: 0x10275cfb8>
> h$sigma.formula
~1
<environment: 0x10275cfb8>
> h$nu.formula
NULL
> h$tau.formula
NULL
So the model is just a mean estimate with a fixed variance (the ~1) across the range of xvar, and there are no complications from higher-order parameters, as there would be in a Box-Cox model. (I am unable to explain why this is not the same as the plotted centiles. For that you would probably need to correspond with the package authors.)
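For a table of centile values at chosen ages, the gamlss package also provides centiles.pred(). The sketch below is an alternative under stated assumptions, not the method used above: it refits the model with gamlss() directly (Normal family, smooth mean in age, constant sigma, mirroring the formulas printed above) instead of lms(), and the seed and the object names m and p are illustrative only.

```r
library(gamlss)

set.seed(123)  # illustrative seed, not from the original question
age    <- sample(5:15, 500, replace = TRUE)
yvar   <- rnorm(500, age, 20)
mydata <- data.frame(age, yvar)

# Assumed stand-in for the lms() fit: Normal family, P-spline mean, constant sigma
m <- gamlss(yvar ~ pb(age), sigma.formula = ~1, family = NO, data = mydata)

# Predicted 50th and 90th centiles at each age from 5 to 15
p <- centiles.pred(m, xname = "age", xvalues = 5:15, cent = c(50, 90))
print(p)
```

Whether these values match the curve drawn by centiles(h, ...) depends on lms() having selected the same family and on no transformation of the x variable having been applied, so it is worth comparing the two before relying on either.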