多层次建模的协方差结构 [英] covariance structure for multilevel modelling
问题描述
我有一个多级重复测量数据集,其中约有300位患者,每组最多有10个重复测量可预测肌钙蛋白升高。数据集中还有其他变量,但这里没有包括它们。
我正在尝试使用 nlme
创建一个随机斜率,随机截距模型,其中患者之间的影响不同,时间的影响在不同患者中也不同。当我尝试引入一阶协方差结构以允许由于时间而导致测量值的相关性时,我收到以下错误消息。
`coef<-。corARMA`(`* tmp *`,value = value [parMap [,i]]))错误:系数矩阵不可逆
我已经包含了我的代码和数据集的样本,对于任何明智的话,我将非常感谢。
#baseline模型仅包含截距。随机斜率-截距因患者而异
随机拦截<-lme(肌钙蛋白〜1,
数据= df,随机=〜1 | record_id,方法= ML,
na.action = na.exclude,
控制= list(opt = optim))
#随机截距和固定时间
timeri<-update(randomintercept ,.〜。 + day)
#随机斜率和截距:不同人的时间影响不同
计时器<-update(timeri,random =〜day | record_id)
#model协方差结构。 corAR1()一阶自回归协方差结构,时间点等距
armodel<-update(timers,correlation = corAR1(0,form =〜day | record_id))
coef<-。corARMA中的错误`(`* tmp *`,value = value [parMap [,i]]):系数矩阵不可逆
数据:
record_id日的肌钙蛋白
1 1 32
2 0 NA
2 1不适用
2 2不适用
2 3 8
2 4 6
2 5 7
2 6 7
2 7 7
2 8不适用
2 9 9
3 0 14
3 1 1167
3 2 1935
4 0 19
4 1 16
4 2 29
5 0不适用
5 1 17
5 2 47
5 3684
6 0 46
6 1 45440
6 2 47085
7 0 48
7 1 87
7 2 44
7 3 20
7 4 15
7 5 11
7 6 10
7 7 11
7 8197
8 0 28
8 1 31
9 0不适用
9 1204
10 0不适用
10 1 19
如果您将优化器更改为 nlminb(或者至少与您发布的精简数据集一起使用,则可以适合) )。
armodel<-update(定时器,
相关性= corAR1(0,形式=〜day | record_id ),
control = list(opt = nlminb))
但是,如果您查看拟合模型,您会发现自己有问题-估计的AR1参数为-1,随机截距和斜率项与r = 0.998相关。
我认为问题在于数据的性质。多数数据似乎在10-50范围内,但存在一到两个数量级的偏移(例如,单个6个,最高约45000)。很难使模型适合这种尖峰数据。我会强烈建议对您的数据进行日志转换;标准诊断图( plot(randomintercept)
)如下:
有点合理,尽管仍有一些证据表明存在异方差性。
AR +随机斜率模型符合OK:
ar.rlog<-update(rlog,
random =〜day | record_id,
相关= corAR1(0,form =〜day | record_id))
##线性混合效应模型通过最大似然
##拟合.. 。
##随机效果:
##公式:〜day | record_id
##结构:一般正定,对数乔列斯基参数化
## StdDev Corr
##(拦截)0.1772409(Intr)
##天0.6045765 0.992
##残差0.4771523
##
##关联结构:ARMA(1,0)
##公式:〜day | record_id
##参数估计值:
## Phi1
## 0.09181557
## ...
快速浏览 intervals(ar.rlog)
可以看到自回归参数的置信区间为(- 0.52,0.65),因此可能不值得保留...
由于模型中的随机斜率,异方差似乎不再是问题所在...
图(rlog,sqrt(abs(resid(。)))〜fitted(。),type = c( p, smooth ))
I have a multilevel repeated measures dataset of around 300 patients each with up to 10 repeated measures predicting troponin rise. There are other variables in the dataset, but I haven't included them here.
I am trying to use nlme
to create a random slope, random intercept model where effects vary between patients, and effect of time is different in different patients. When I try to introduce a first-order covariance structure to allow for the correlation of measurements due to time I get the following error message.
Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible
I have included my code and a sample of the dataset, and I would be very grateful for any words of wisdom.
#baseline model includes only the intercept. Random slopes - intercept varies across patients
randomintercept <- lme(troponin ~ 1,
data = df, random = ~1|record_id, method = "ML",
na.action = na.exclude,
control = list(opt="optim"))
#random intercept and time as fixed effect
timeri <- update(randomintercept,.~. + day)
#random slopes and intercept: effect of time is different in different people
timers <- update(timeri, random = ~ day|record_id)
#model covariance structure. corAR1() first order autoregressive covariance structure, timepoints equally spaced
armodel <- update(timers, correlation = corAR1(0, form = ~day|record_id))
Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible
Data:
record_id day troponin
1 1 32
2 0 NA
2 1 NA
2 2 NA
2 3 8
2 4 6
2 5 7
2 6 7
2 7 7
2 8 NA
2 9 9
3 0 14
3 1 1167
3 2 1935
4 0 19
4 1 16
4 2 29
5 0 NA
5 1 17
5 2 47
5 3 684
6 0 46
6 1 45440
6 2 47085
7 0 48
7 1 87
7 2 44
7 3 20
7 4 15
7 5 11
7 6 10
7 7 11
7 8 197
8 0 28
8 1 31
9 0 NA
9 1 204
10 0 NA
10 1 19
You can fit this if you change your optimizer to "nlminb" (or at least it works with the reduced data set you posted).
armodel <- update(timers,
correlation = corAR1(0, form = ~day|record_id),
control=list(opt="nlminb"))
However, if you look at the fitted model, you'll see you have problems - the estimated AR1 parameter is -1 and the random intercept and slope terms are correlated with r=0.998.
I think the problem is with the nature of the data. Most of the data seem to be in the range 10-50, but there are excursions by one or two orders of magnitude (e.g. individual 6, up to about 45000). It might be hard to fit a model to data this spiky. I would strongly suggest log-transforming your data; the standard diagnostic plot (plot(randomintercept)
) looks like this:
whereas fitting on the log scale
rlog <- update(randomintercept,log10(troponin) ~ .)
plot(rlog)
is somewhat more reasonable, although there is still some evidence of heteroscedasticity.
The AR+random-slopes model fits OK:
ar.rlog <- update(rlog,
random = ~day|record_id,
correlation = corAR1(0, form = ~day|record_id))
## Linear mixed-effects model fit by maximum likelihood
## ...
## Random effects:
## Formula: ~day | record_id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 0.1772409 (Intr)
## day 0.6045765 0.992
## Residual 0.4771523
##
## Correlation Structure: ARMA(1,0)
## Formula: ~day | record_id
## Parameter estimate(s):
## Phi1
## 0.09181557
## ...
A quick glance at intervals(ar.rlog)
shows that the confidence intervals on the autoregressive parameter are (-0.52,0.65), so it may not be worth keeping ...
With the random slopes in the model the heteroscedasticity no longer seems problematic ...
plot(rlog,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))
这篇关于多层次建模的协方差结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!