Getting percentile values from gamlss centile curves

Question

This question is related to: Selecting Percentile curves using gamlss::lms in R

I can get a centile curve from the following data and code:

library(gamlss)
age = sample(5:15, 500, replace=TRUE)
yvar = rnorm(500, age, 20)
mydata = data.frame(age, yvar)
head(mydata)
  age      yvar
1  12  13.12974
2  14 -18.97290
3  10  42.11045
4  12  27.89088
5  11  48.03861
6   5  24.68591

h = lms(yvar, age , data=mydata, n.cyc=30)
centiles(h,xvar=mydata$age, cent=c(90), points=FALSE)

How can I now get the yvar value on the curve for each x value (5:15), i.e. the 90th percentile of the data after smoothing?

I read the help pages and found fitted(h) and fv(h), which return fitted values for the entire data set. But how do I get the value on the 90th centile curve for each age level? Thanks for your help.

Edit: The following figure shows what I need:

[Figure not included]

I tried the following, but it is not correct, since the values are wrong:

mydata$fitted = fitted(h)
aggregate(fitted~age, mydata, function(x) quantile(x,.9))
   age    fitted
1    5  6.459680
2    6  6.280579
3    7  6.290599
4    8  6.556999
5    9  7.048602
6   10  7.817276
7   11  8.931219
8   12 10.388048
9   13 12.138104
10  14 14.106250
11  15 16.125688

These values are very different from the 90th quantile computed directly from the data:

> aggregate(yvar~age, mydata, function(x) quantile(x,.9))
   age     yvar
1    5 39.22938
2    6 35.69294
3    7 25.40390
4    8 26.20388
5    9 29.07670
6   10 32.43151
7   11 24.96861
8   12 37.98292
9   13 28.28686
10  14 43.33678
11  15 44.46269


Recommended answer

See if this makes sense. The 90th percentile of a Normal distribution with mean 'smn' and standard deviation 'ssd' is qnorm(.9, smn, ssd). So this seems to deliver (somewhat) sensible results, albeit not the full hack of centiles that I suggested:

 plot(h$xvar, qnorm(.9, fitted(h), h$sigma.fv))
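To get one number per age instead of a plot, the same qnorm expression can be collapsed by x value. The following is a minimal sketch under the Normal-model assumption discussed in this answer, not output from the question's fit. The function name normal_centile_by_x and the stand-in vectors are hypothetical. It also shows why aggregating quantiles of fitted(h) alone cannot work: within one age all fitted values are the same mean, so their "quantile" is just that mean, with no sigma added.

```r
# Collapse Normal-model centiles to one value per distinct x.
# mu, sigma: fitted mean and sd for each observation; x: covariate; p: centile in (0, 1).
normal_centile_by_x <- function(mu, sigma, x, p = 0.9) {
  q <- qnorm(p, mean = mu, sd = sigma)
  # Observations sharing an x share the same fitted mu and sigma,
  # so the within-x mean simply picks that common value.
  aggregate(centile ~ x, data = data.frame(x = x, centile = q), FUN = mean)
}

# Hypothetical stand-in values (NOT taken from the fitted model in the question):
x     <- rep(5:7, each = 3)
mu    <- rep(c(10, 11, 12), each = 3)
sigma <- rep(20, 9)
res <- normal_centile_by_x(mu, sigma, x, p = 0.9)
print(res)

# With the fitted object from the question, the call would presumably be:
# normal_centile_by_x(fitted(h), h$sigma.fv, h$xvar, p = 0.9)
```

Each row equals mu + qnorm(0.9) * sigma for that x, which is the Normal-model 90th centile at that age.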

(Note the massive overplotting: there are only a few distinct xvar values but 500 points. You may also want to set ylim so that the full range can be appreciated.)

The caveat here is that you need to check the other parts of the model to see whether it really is just an ordinary Normal model. In this case it seems to be:

> h$mu.formula
y ~ pb(x)
<environment: 0x10275cfb8>
> h$sigma.formula
~1
<environment: 0x10275cfb8>
> h$nu.formula
NULL
> h$tau.formula
NULL

So the model is just a mean estimate with a fixed variance (the ~1) across the range of xvar, and there are no complications from higher-order parameters such as those of a Box-Cox model. (I'm unable to explain why this is not the same as the plotted centiles. For that you probably need to correspond with the package authors.)
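The manual inspection above could be wrapped in a small helper. This is only a sketch: the name uses_plain_normal is hypothetical, and it assumes the fitted object stores the family abbreviation as the first element of its family field. Absent nu and tau formulas alone are necessary but not sufficient, because other two-parameter families also lack them, hence the extra family check.

```r
# Rough check that the shortcut qnorm(p, mu, sigma) applies to a fit.
# `fit` is any list-like object exposing the fields printed above.
uses_plain_normal <- function(fit) {
  no_shape_params <- is.null(fit$nu.formula) && is.null(fit$tau.formula)
  # Assumption: the family abbreviation is the first element of fit$family.
  normal_family <- identical(fit$family[1], "NO")
  no_shape_params && normal_family
}

# Hypothetical stand-ins mimicking those fields (not real fitted models):
normal_like <- list(family = c("NO", "Normal"),
                    nu.formula = NULL, tau.formula = NULL)
boxcox_like <- list(family = c("BCCGo", "Box-Cox-Cole-Green-orig."),
                    nu.formula = ~1, tau.formula = NULL)

uses_plain_normal(normal_like)  # TRUE
uses_plain_normal(boxcox_like)  # FALSE
```

If the check fails, the quantile function of the fitted family would be needed in place of qnorm.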
