Probit regression with data augmentation in Stan


Problem description


I am attempting to do a probit model with data augmentation using Stan. This is where we have outcomes y, either 0 or 1, that tell us the sign of the latent variable ystar. This is what I have so far, but I'm not sure how to add information about y in the model section. Any thoughts?

data {
  int<lower=0> N; // number of obs
  int<lower=0> K; // number of predictors
  int<lower=0,upper=1> y[N]; // outcomes
  matrix[N, K] x; // predictor variables
}
parameters {
  vector[K] beta; // beta coefficients
  vector[N] ystar; // latent variable
}
model {
  vector[N] mu; 
  beta ~ normal(0, 100);
  mu <- x*beta;
  ystar ~ normal(mu, 1);
}
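
One common way to write down the latent-variable setup described in the question (with the error variance fixed at 1, as in the code above) is

\[
y^*_n = x_n^\top \beta + \varepsilon_n, \qquad \varepsilon_n \sim \mathcal{N}(0, 1), \qquad
y_n = \begin{cases} 1 & \text{if } y^*_n > 0,\\ 0 & \text{otherwise,} \end{cases}
\]

so the observed y_n records only the sign of the latent ystar.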

Answer


You could do

data {
  int<lower=0> N; // number of obs
  int<lower=0> K; // number of predictors
  vector<lower=-1,upper=1>[N] sign; // y = 0 -> -1, y = 1 -> 1
  matrix[N, K] x; // predictor variables
}
parameters {
  vector[K] beta; // beta coefficients
  vector<lower=0>[N] abs_ystar; // latent variable
}
model {
  beta ~ normal(0, 100);
  // ignore the warning about a Jacobian from the parser
  sign .* abs_ystar ~ normal(x * beta, 1);
}
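
If you would rather keep passing the 0/1 outcomes y (as declared in the question's data block) instead of precomputing the -1/1 sign vector outside Stan, one option is to build sign from y inside the program. Here is a minimal sketch, assuming y is added back to the data block alongside the other declarations (written with the same old-style <- assignment used above; current Stan versions use = instead):

transformed data {
  vector[N] sign;
  for (n in 1:N)
    sign[n] <- 2 * y[n] - 1; // y = 0 -> -1, y = 1 -> +1
}

With this block in place, the model section can stay exactly as written, since sign is computed once when the data are read in.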


That said, there is no reason to do data augmentation in Stan for a binary probit model, unless some of the outcomes were missing or something. It is more straightforward (and reduces the parameter space to K instead of K + N) to do

data {
  int<lower=0> N; // number of obs
  int<lower=0> K; // number of predictors
  int<lower=0,upper=1> y[N]; // outcomes
  matrix[N, K] x; // predictor variables
}
parameters {
  vector[K] beta; // beta coefficients
}
model {
  vector[N] mu;
  beta ~ normal(0, 100);
  mu <- x*beta;
  for (n in 1:N)
    mu[n] <- Phi(mu[n]);
  y ~ bernoulli(mu);
}

If you really care about the latent utility, you could generate it via rejection sampling in the generated quantities block, like this:

generated quantities {
  vector[N] ystar;
  {
    vector[N] mu;
    mu <- x * beta;
    for (n in 1:N) {
      real draw;
      draw <- not_a_number();
      if (sign[n] == 1)
        while (!(draw > 0))
          draw <- normal_rng(mu[n], 1);
      else
        while (!(draw < 0))
          draw <- normal_rng(mu[n], 1);
      ystar[n] <- draw;
    }
  }
}
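
Two small notes on the code above. First, the generated quantities block tests sign[n], which is declared in the first (data-augmented) program; if the block is attached to the second program, whose data block declares y instead, the corresponding test is y[n] == 1. Second, Phi is the standard normal CDF, so the second program is exactly the probit likelihood obtained by integrating the latent ystar out of the augmented model,

\[
\Pr(y_n = 1 \mid x_n, \beta) = \Pr(y^*_n > 0) = \Phi(x_n^\top \beta),
\]

which is why the parameter space shrinks from K + N to K.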

