在Stan中开发非线性增长曲线模型的分层版本 [英] Developing Hierarchical Version of Nonlinear Growth Curve Model in Stan

查看:122
本文介绍了在Stan中开发非线性增长曲线模型的分层版本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

以下模型是Preece and Baines(1978年,人类生物学年鉴)的模型1,用于描述人类的成长.

The following model is model 1 of Preece and Baines (1978, Annals of Human Biology), and is used to describe human growth.

我对此模型的Stan代码如下:

My Stan code for this model is as follows:

```{stan output.var="test"}
 data {
  int<lower=1> n;
  ordered[n] t; // age
  ordered[n] y; // height of human
 }

 parameters {
  positive_ordered[2] h;
  real<lower=0, upper=t[n-1]>theta;
  positive_ordered[2] s; 
  real<lower=0>sigma;
 }

 model {
  h[1] ~ uniform(0, y[n]);
  h[2] ~ normal(180, 20);
  sigma ~ student_t(2, 0, 1);
  s[1] ~ normal(5, 5);
  s[2] ~ normal(5, 5);
  theta ~ normal(10, 5);

  y ~ normal(h[2] - (2*(h[2] - h[1]) * inv(exp(s[1]*(t - theta)) + exp(s[2]*(t - theta)))), sigma);
 }
```

我现在想创建此模型的分层版本,其参数建模在男孩之间相同(例如,测量变异性sigma),或者在分层版本中(例如,h_1,成人身高).

I now want to create a hierarchical version of this model, with parameters modelled either identical across boys (such as the measurement variability sigma) or in a hierarchical version (such as h_1, the adult height).

对于参数sigmatheta,我想在男孩中保持这些先验值与sigma ~ student_t(2, 0, 1);theta ~ normal(10, 5);相同.

As for the parameters sigma and theta, I want to maintain these priors as identical across boys as sigma ~ student_t(2, 0, 1); and theta ~ normal(10, 5);.

不幸的是,我几乎没有实施层次建模的经验,而且我在尝试做贝叶斯教科书中简单的二项式层次模型之外的任何层次示例方面都感到吃力(请参见 Statistical的第12章,第358-359页)由Richard McElreath重新思考.但是,我确实了解分层建模背后的理论,这在安德鲁·盖尔曼(Andrew Gelman)的贝叶斯数据分析(Bayesian Data Analysis)的第5章和理查德·麦克埃尔瑞斯(Richard McElreath)的<统计>重新思考(emsistical Rethinking)的第12章中都有所描述.

Unfortunately, I have almost no experience in implementing hierarchical modelling, and I have struggled in my attempts at doing any hierarchical examples beyond the simple binomial hierarchical models in Bayesian textbooks (see chapter 12, pages 358-359 of Statistical Rethinking by Richard McElreath). I do, however, understand the theory behind hierarchical modelling, as written in chapter 5 of Bayesian Data Analysis by Andrew Gelman and chapter 12 of Statistical Rethinking by Richard McElreath.

如果人们能抽出一些时间来演示如何在Stan中实现这种类型的层次模型,我将不胜感激.如果要求不高,我将不胜感激您可以在代码旁边提供任何解释,以便将来我可以学习如何独立实现这些类型的层次模型示例.

I would appreciate it if people could please take the time to demonstrate how this type of hierarchical model would be implemented in Stan. If it isn't too much to ask, I would greatly appreciate any explanations you could provide alongside the code, so that I may learn how to implement these types of hierarchical model examples independently in the future.

我的数据示例如下:

    age boy01 boy02 boy03 boy04 boy05 boy06 boy07 boy08 boy09 boy10 boy11 boy12 boy13 boy14 boy15 boy16 boy17 boy18
 1  1     81.3  76.2  76.8  74.1  74.2  76.8  72.4  73.8  75.4  78.8  76.9  81.6  78    76.4  76.4  76.2  75    79.7
 2  1.25  84.2  80.4  79.8  78.4  76.3  79.1  76    78.7  81    83.3  79.9  83.7  81.8  79.4  81.2  79.2  78.4  81.3
 3  1.5   86.4  83.2  82.6  82.6  78.3  81.1  79.4  83    84.9  87    84.1  86.3  85    83.4  86    82.3  82    83.3
 4  1.75  88.9  85.4  84.7  85.4  80.3  84.4  82    85.8  87.9  89.6  88.5  88.8  86.4  87.6  89.2  85.4  84    86.5
 5  2     91.4  87.6  86.7  88.1  82.2  87.4  84.2  88.4  90    91.4  90.6  92.2  87.1  91.4  92.2  88.4  85.9  88.9
 6  3    101.   97    94.2  98.6  89.4  94    93.2  97.3  97.3 100.   96.6  99.3  96.2 101.  101.  101    95.6  99.4
 7  4    110.  105.  100.  104.   96.9 102.  102.  107.  103.  111   105.  106.  104   106.  110.  107.  102.  104. 
 8  5    116.  112.  107.  111   104.  109.  109   113.  108.  118.  112   113.  111   113.  117.  115.  109.  112. 
 9  6    122.  119.  112.  116.  111.  116.  117.  119.  114.  126.  119.  120.  117.  120.  122.  121.  118.  119  
10  7    130   125   119.  123.  116.  122.  123.  126.  120.  131.  125.  127.  124.  129.  130.  128   125.  128  

我很抱歉在小数点后位数缺乏精确度.数据以小节表的形式出现,它似乎并未响应R的通常命令以实现更高的精度.为了保持一致性,最好忽略第5行之后的所有行,因为第1-5行显示了原始数据中存在的完整精度.

My apologies for the lack of precision in the decimal places. The data is in the form of a tibble table, which doesn't seem to respond to R's usual commands for greater precision. For the sake of consistency, it might be better to simply ignore all of the rows after row 5, since rows 1 - 5 display the full precision that is present in the original data.

在完整数据中,年龄是

> Children$age
 [1]  1.00  1.25  1.50  1.75  2.00  3.00  4.00  5.00  6.00  7.00  8.00  8.50  9.00  9.50 10.00 10.50 11.00 11.50 12.00 12.50
[21] 13.00 13.50 14.00 14.50 15.00 15.50 16.00 16.50 17.00 17.50 18.00

有39个男孩,以与上述示例相同的宽数据格式列出.

And there are 39 boys, listed in the same wide-data format as the above sample.

推荐答案

免责声明:首先,让我们使用Stan拟合(非分层)非线性增长模型.

Disclaimer: As a start let's fit a (non-hierarchical) non-linear growth model using Stan.

  1. 我们读取了示例数据.

  1. We read in the sample data.

library(tidyverse);
df <- read.table(text = "
age boy01 boy02 boy03 boy04 boy05 boy06 boy07 boy08 boy09 boy10 boy11 boy12 boy13 boy14 boy15 boy16 boy17 boy18
1  1     81.3  76.2  76.8  74.1  74.2  76.8  72.4  73.8  75.4  78.8  76.9  81.6  78    76.4  76.4  76.2  75    79.7
2  1.25  84.2  80.4  79.8  78.4  76.3  79.1  76    78.7  81    83.3  79.9  83.7  81.8  79.4  81.2  79.2  78.4  81.3
3  1.5   86.4  83.2  82.6  82.6  78.3  81.1  79.4  83    84.9  87    84.1  86.3  85    83.4  86    82.3  82    83.3
4  1.75  88.9  85.4  84.7  85.4  80.3  84.4  82    85.8  87.9  89.6  88.5  88.8  86.4  87.6  89.2  85.4  84    86.5
5  2     91.4  87.6  86.7  88.1  82.2  87.4  84.2  88.4  90    91.4  90.6  92.2  87.1  91.4  92.2  88.4  85.9  88.9
6  3    101.   97    94.2  98.6  89.4  94    93.2  97.3  97.3 100.   96.6  99.3  96.2 101.  101.  101    95.6  99.4
7  4    110.  105.  100.  104.   96.9 102.  102.  107.  103.  111   105.  106.  104   106.  110.  107.  102.  104.
8  5    116.  112.  107.  111   104.  109.  109   113.  108.  118.  112   113.  111   113.  117.  115.  109.  112.
9  6    122.  119.  112.  116.  111.  116.  117.  119.  114.  126.  119.  120.  117.  120.  122.  121.  118.  119
10  7    130   125   119.  123.  116.  122.  123.  126.  120.  131.  125.  127.  124.  129.  130.  128   125.  128", header = T, row.names = 1);
df <- df %>%
    gather(boy, height, -age);

  • 我们定义了Stan模型.

  • We define the Stan model.

    model <- "
    data {
        int N;                                       // Number of observations
        real y[N];                                   // Height
        real t[N];                                   // Time
    }
    
    parameters {
        real<lower=0,upper=1> s[2];
        real h_theta;
        real theta;
        real sigma;
    }
    
    transformed parameters {
        vector[N] mu;
        real h1;
        h1 = max(y);
        for (i in 1:N)
            mu[i] = h1 - 2 * (h1 - h_theta) / (exp(s[1] * (t[i] - theta)) + (exp(s[2] * (t[i] - theta))));
    }
    
    model {
        // Priors
        s ~ cauchy(0, 2.5);
    
        y ~ normal(mu, sigma);
    }
    "
    

    在这里,我们考虑对s[1]s[2]进行弱信息化(正则化)先验.

    Here we consider weakly informative (regularising) priors on s[1] and s[2].

    使模型适合数据.

    library(rstan);
    options(mc.cores = parallel::detectCores());
    rstan_options(auto_write = TRUE);    
    model <- stan(
        model_code = model,
        data = list(
            N = nrow(df),
            y = df$height,
            t = df$age),
        iter = 4000);
    

  • 6个模型参数的摘要:

  • Summary of the 6 model parameters:

    summary(model, pars = c("h1", "h_theta", "theta", "s", "sigma"))$summary 
    #               mean     se_mean        sd        2.5%         25%         50%
    #h1      131.0000000 0.000000000 0.0000000 131.0000000 131.0000000 131.0000000
    #h_theta 121.6874553 0.118527828 2.7554944 115.4316738 121.1654809 122.2134014
    #theta     6.5895553 0.019738319 0.5143429   5.4232740   6.4053479   6.6469534
    #s[1]      0.7170836 0.214402086 0.3124318   0.1748077   0.3843143   0.8765256
    #s[2]      0.3691174 0.212062373 0.3035039   0.1519308   0.1930381   0.2066811
    #sigma     3.1524819 0.003510676 0.1739904   2.8400096   3.0331962   3.1411533
    #                75%       97.5%       n_eff     Rhat
    #h1      131.0000000 131.0000000 8000.000000      NaN
    #h_theta 123.0556379 124.3928800  540.453594 1.002660
    #theta     6.8790801   7.3376348  679.024115 1.002296
    #s[1]      0.9516115   0.9955989    2.123501 3.866466
    #s[2]      0.2954409   0.9852540    2.048336 6.072550
    #sigma     3.2635849   3.5204101 2456.231113 1.001078
    

    那是什么意思?从s[1]s[2]Rhat值中,您可以看到这两个参数存在收敛问题.这是因为s[1]s[2]是无法区分的:无法同时估计它们.在s[1]s[2]上进行更强的先验正则化可能会将s参数之一驱动为零.

    So what does this mean? From the Rhat values for s[1] and s[2] you can see that there are issues with convergence for these two parameters. This is due to the fact that s[1] and s[2] are indistinguishable: they cannot be estimated both at the same time. A stronger regularising prior on s[1] and s[2] would probably drive one of the s parameters to zero.

    我不确定我是否理解s[1]s[2]的要点.从统计建模的角度来看,您无法在我们正在考虑的简单非线性增长模型中获得两个参数的估计值.

    I'm not sure I understand the point of s[1] and s[2]. From a statistical modelling point of view, you cannot obtain estimates for both parameters in the simple non-linear growth model that we're considering.

    正如这里所承诺的那样,是一个更新.这变成了一篇很长的文章,我试图通过添加其他解释来使事情变得尽可能清晰.

    As promised here is an update. This is turning into quite a long post, I've tried to make things as clear as possible by adding additional explanations.

    1. 使用positive_ordered作为s的数据类型在解决方案的收敛方面有很大的不同.我不清楚为什么会这样,我也不知道Stan是如何实现positive_ordered的,但是它可以工作.

    1. Using positive_ordered as data type for s makes a significant difference in terms of convergence of solutions. It's not clear to me why that is the case, nor do I know how Stan implements positive_ordered, but it works.

    在这种分层方法中,我们通过考虑h1 ~ normal(mu_h1, sigma_h1)来部分合并所有男孩的身高数据,其中超参数mu_h1 ~ normal(max(y), 10)为先验,而sigma_h1 ~ cauchy(0, 10)则为半主动(因为它sigma声明为real<lower=0>).

    In this hierarchical approach, we partially pool height data across all boys by considering h1 ~ normal(mu_h1, sigma_h1), with priors on the hyperparameters mu_h1 ~ normal(max(y), 10) and a half-Cauchy prior on sigma_h1 ~ cauchy(0, 10) (it's half-Cauchy because sigma is declared as real<lower=0>).

    说实话,我不确定某些参数的解释(和可解释性). h_1h_theta的估算值非常相似,并且以某种方式彼此抵消.我可以想象,这在拟合模型时会产生一些收敛性问题,但是正如您进一步看到的那样,Rhat值似乎还可以.不过,由于我对模型,数据及其上下文了解得不够多,因此我对这些参数的可解释性仍持怀疑态度.从统计建模的角度来看,通过将其他一些参数转换为组级参数来扩展模型非常简单;但是我认为困难将来自于难以区分和缺乏可解释性.

    To be honest, I am unsure about the interpretation (and interpretability) of some of the parameters. Estimates for h_1 and h_theta are very similar, and in some way cancel each other out. I would imagine that this creates some convergence issues when fitting the model, but as you can see further down, Rhat values seem ok. Still, as I don't know enough about the model, data and its context, I remain somewhat skeptical as to the interpretability of those parameters. Extending the model by turning some of the other parameters into group-level parameters is straightforward from a statistical modelling point of view; however I imagine that difficulties will arise from the indistinguishability and lack of interpretability.

    撇开所有这些要点,这应该为您提供有关如何实现分层模型的实际示例.

    All these points aside, this should give you a practical example of how to implement a hierarchical model.

    model_code <- "
    data {
        int N;                                       // Number of observations
        int J;                                       // Number of boys
        int<lower=1,upper=J> boy[N];                 // Boy of observation
        real y[N];                                   // Height
        real t[N];                                   // Time
    }
    
    parameters {
        real<lower=0> h1[J];
        real<lower=0> h_theta;
        real<lower=0> theta;
        positive_ordered[2] s;
        real<lower=0> sigma;
    
        // Hyperparameters
        real<lower=0> mu_h1;
        real<lower=0> sigma_h1;
    }
    
    transformed parameters {
        vector[N] mu;
    
        for (i in 1:N)
            mu[i] = h1[boy[i]] - 2 * (h1[boy[i]] - h_theta) / (exp(s[1] * (t[i] - theta)) + (exp(s[2] * (t[i] - theta))));
    }
    
    model {
        h1 ~ normal(mu_h1, sigma_h1);          // Partially pool h1 parameters across boys
    
        mu_h1 ~ normal(max(y), 10);            // Prior on h1 hyperparameter mu
        sigma_h1 ~ cauchy(0, 10);              // Half-Cauchy prior on h1 hyperparameter sigma
        h_theta ~ normal(max(y), 2);           // Prior on h_theta
        theta ~ normal(max(t), 2);             // Prior on theta
        s ~ cauchy(0, 1);                      // Half-Cauchy priors on s[1] and s[2]
    
        y ~ normal(mu, sigma);
    }
    "
    

    总结:我们通过将成人身高参数建模为h1 ~ normal(mu_h1, sigma_h1)来汇总所有男孩的身高数据,以提高组(即男孩)水平的估计,其中超参数mu_h1sigma_h1表征了正态分布所有男孩的成人身高值.我们在超参数上选择弱信息先验,并在所有其余参数上选择弱信息先验,类似于第一个完整池示例.

    To summarise: We pool height data from all boys to improve estimates at the group (i.e. boy) level, by modelling the adult height parameter as h1 ~ normal(mu_h1, sigma_h1), where the hyperparameters mu_h1 and sigma_h1 characterise the normal distribution of adult height values across all boys. We choose weakly informative priors on the hyperparameters, and choose weakly informative priors on all remaining parameters similar to the first complete-pooling example.

    # Fit model
    fit <- stan(
        model_code = model_code,
        data = list(
            N = nrow(df),
            J = length(unique(df$boy)),
            boy = df$boy,
            y = df$height,
            t = df$age),
        iter = 4000)
    

    摘录摘要

    我们提取所有参数的参数估计值;请注意,我们现在拥有的h1参数与组(即男孩)一样多.

    Extract summary

    We extract parameter estimates for all parameters; note that we now have as many h1 parameters as there are groups (i.e. boys).

    # Get summary
    summary(fit, pars = c("h1", "h_theta", "theta", "s", "sigma"))$summary
    #               mean      se_mean         sd        2.5%         25%        50%
    #h1[1]   142.9406153 0.1046670943 2.41580757 138.4272280 141.2858391 142.909765
    #h1[2]   143.7054020 0.1070466445 2.46570025 139.1301456 142.0233342 143.652657
    #h1[3]   144.0352331 0.1086953809 2.50145442 139.3982034 142.3131167 143.971473
    #h1[4]   143.8589955 0.1075753575 2.48015745 139.2689731 142.1666685 143.830347
    #h1[5]   144.7359976 0.1109871908 2.55284812 140.0529359 142.9917503 144.660586
    #h1[6]   143.9844938 0.1082691127 2.49497990 139.3378948 142.2919990 143.926931
    #h1[7]   144.3857221 0.1092604239 2.51645359 139.7349112 142.6665955 144.314645
    #h1[8]   143.7469630 0.1070594855 2.46860328 139.1748700 142.0660983 143.697302
    #h1[9]   143.6841113 0.1072208284 2.47391295 139.0885987 141.9839040 143.644357
    #h1[10]  142.9518072 0.1041206784 2.40729732 138.4289207 141.3114204 142.918407
    #h1[11]  143.5352502 0.1064173663 2.45712021 138.9607665 141.8547610 143.483157
    #h1[12]  143.0941582 0.1050061258 2.42894673 138.5579378 141.4295430 143.055576
    #h1[13]  143.6194965 0.1068494690 2.46574352 138.9426195 141.9412820 143.577920
    #h1[14]  143.4477182 0.1060254849 2.44776536 138.9142081 141.7708660 143.392231
    #h1[15]  143.1415683 0.1049131998 2.42575487 138.6246642 141.5014391 143.102219
    #h1[16]  143.5686919 0.1063594201 2.45328456 139.0064573 141.8962853 143.510276
    #h1[17]  144.0170715 0.1080567189 2.49269747 139.4162885 142.3138300 143.965127
    #h1[18]  143.4740997 0.1064867748 2.45545200 138.8768051 141.7989566 143.426211
    #h_theta 134.3394366 0.0718785944 1.72084291 130.9919889 133.2348411 134.367152
    #theta     8.2214374 0.0132434918 0.45236221   7.4609612   7.9127800   8.164685
    #s[1]      0.1772044 0.0004923951 0.01165119   0.1547003   0.1705841   0.177522
    #s[2]      1.6933846 0.0322953612 1.18334358   0.6516669   1.1630900   1.463148
    #sigma     2.2601677 0.0034146522 0.13271459   2.0138514   2.1657260   2.256678
    #                75%       97.5%     n_eff     Rhat
    #h1[1]   144.4795105 147.8847202  532.7265 1.008214
    #h1[2]   145.2395543 148.8419618  530.5599 1.008187
    #h1[3]   145.6064981 149.2080965  529.6183 1.008087
    #h1[4]   145.4202919 149.0105666  531.5363 1.008046
    #h1[5]   146.3200407 150.0701757  529.0592 1.008189
    #h1[6]   145.5551778 149.1365279  531.0372 1.008109
    #h1[7]   145.9594956 149.5996605  530.4593 1.008271
    #h1[8]   145.3032680 148.8695637  531.6824 1.008226
    #h1[9]   145.2401743 148.7674840  532.3662 1.008023
    #h1[10]  144.4811712 147.9218834  534.5465 1.007937
    #h1[11]  145.1153635 148.5968945  533.1235 1.007988
    #h1[12]  144.6479561 148.0546831  535.0652 1.008115
    #h1[13]  145.1660639 148.6562172  532.5386 1.008138
    #h1[14]  144.9975197 148.5273804  532.9900 1.008067
    #h1[15]  144.6733010 148.1130207  534.6057 1.008128
    #h1[16]  145.1163764 148.6027096  532.0396 1.008036
    #h1[17]  145.5578107 149.2014363  532.1519 1.008052
    #h1[18]  145.0249329 148.4886949  531.7060 1.008055
    #h_theta 135.4870338 137.6753254  573.1698 1.006818
    #theta     8.4812339   9.2516700 1166.7226 1.002306
    #s[1]      0.1841457   0.1988365  559.9036 1.005333
    #s[2]      1.8673249   4.1143099 1342.5839 1.001562
    #sigma     2.3470429   2.5374239 1510.5824 1.001219
    

    可视化成人身高估算值

    最后,我们绘制了所有男孩的成人身高估计值,包括50%和97%的置信区间.

    Visualise adult height estimates

    Finally we plot adult height estimates h_1 for all boys, including 50% and 97% confidence intervals.

    # Plot h1 values
    summary(fit, pars = c("h1"))$summary %>%
        as.data.frame() %>%
        rownames_to_column("Variable") %>%
        mutate(
            Variable = gsub("(h1\\[|\\])", "", Variable),
            Variable = df$key[match(Variable, df$boy)]) %>%
        ggplot(aes(x = `50%`, y = Variable)) +
        geom_point(size = 3) +
        geom_segment(aes(x = `2.5%`, xend = `97.5%`, yend = Variable), size = 1) +
        geom_segment(aes(x = `25%`, xend = `75%`, yend = Variable), size = 2) +
        labs(x = "Median (plus/minus 95% and 50% CIs)", y = "h1")
    

    这篇关于在Stan中开发非线性增长曲线模型的分层版本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆