Linear predictor - ordered probit (ordinal, clm)


Question

I have a question about the ordinal package in R, specifically about the predict.clm() function. I would like to calculate the linear predictor of an ordered probit estimation. With the polr function of the MASS package, the linear predictor can be accessed via object$lp. It gives me one value for each row, which matches my understanding of the linear predictor, namely X_i'beta. However, if I use predict.clm(object, newdata, "linear.predictor") on an ordered probit model estimated with clm(), I get a list with the elements eta1 and eta2,

  1. with one column each, if the newdata contains the dependent variable
  2. where each element contains as many columns as there are levels in the dependent variable, if the newdata doesn't contain the dependent variable

Unfortunately, I don't have a clue what that means. I also can't find any information about it in the author's documentation and papers. Would one of you be so kind as to enlighten me? That would be great.

Cheers

AK

Recommended answer

Update (after the comments):

The basic clm model is defined as follows (see the clm tutorial for details):
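The formula itself did not survive in this copy of the answer. As a sketch, assuming the standard cumulative link model with clm's default logit link as described in the clm tutorial, it can be written as:

```latex
\operatorname{logit}\bigl(P(Y_i \le j)\bigr) = \theta_j - x_i^{\top}\beta,
\qquad i = 1,\dots,n, \quad j = 1,\dots,J-1
```

Here the \theta_j are the thresholds (the "thetas" below), \beta are the regression coefficients (the "betas"), and J is the number of response levels. For an ordered probit model, the logit would be replaced by the inverse standard normal CDF, \Phi^{-1}.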

Generate data:

library(ordinal)
set.seed(1)
test.data = data.frame(y=gl(4,5),
                       x=matrix(c(sample(1:4,20,T)+rnorm(20), rnorm(20)), ncol=2))
head(test.data) # two independent variables 
test.data$y     # four levels in y

Build the models:

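library(MASS)  # provides polr
fm.polr <- polr(y ~ x.1 + x.2, data = test.data) # using polr (columns are named x.1 and x.2)
fm.clm  <- clm(y ~ x.1 + x.2, data = test.data)  # using clm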

Now we can access the thetas and betas (see the formula above):

# Thetas
fm.polr$zeta # using polr
fm.clm$alpha # using clm
# Betas
fm.polr$coefficients # using polr
fm.clm$beta          # using clm

Obtain the linear predictors (only the part of the formula's right-hand side that does not involve theta, i.e. x_i'beta):

fm.polr$lp                                                 # using polr
apply(test.data[,2:3], 1, function(x) sum(fm.clm$beta*x))  # using clm

Generate new data:

# Contains only independent variables
new.data <- data.frame(x=matrix(c(rnorm(10)+sample(1:4,10,T), rnorm(10)), ncol=2))
new.data[1,] <- c(0,0)  # intentionally for demonstration purpose
new.data

Four types of predictions are available for a clm model. We are interested in type="linear.predictor", which returns a list with two matrices, eta1 and eta2. They contain the linear predictors for each observation in new.data:

lp.clm <- predict(fm.clm, new.data, type="linear.predictor")
lp.clm

Note 1: eta1 and eta2 hold the same values, shifted by one position in the j index: column j of eta1 equals column j+1 of eta2. As a result, eta1 has no finite threshold in its last column (the upper end of the linear predictor scale), and eta2 has none in its first column (the lower end).

all.equal(lp.clm$eta1[,1:3], lp.clm$eta2[,2:4], check.attributes=FALSE)
# [1] TRUE

Note 2: The prediction for the first row of new.data equals the thetas, because we set that row to zeros.

all.equal(lp.clm$eta1[1,1:3], fm.clm$alpha, check.attributes=FALSE)
# [1] TRUE

Note 3: We can construct such predictions manually. For instance, the prediction for the second row of new.data:

second.line <- fm.clm$alpha - sum(fm.clm$beta*new.data[2,])
all.equal(lp.clm$eta1[2,1:3], second.line, check.attributes=FALSE)
# [1] TRUE

Note 4: If new.data contains the response variable, predict returns only the linear predictor for the specified level of y. Again, we can check this manually:

new.data$y <- gl(4,3,length=10)
lp.clm.y <- predict(fm.clm, new.data, type="linear.predictor")
lp.clm.y

lp.manual <- sapply(1:10, function(i) lp.clm$eta1[i,new.data$y[i]])
all.equal(lp.clm.y$eta1, lp.manual)
# [1] TRUE
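
To connect eta1 and eta2 back to probabilities, here is a minimal sketch (not part of the original answer). It assumes the wine data set shipped with the ordinal package and the default logit link. The category probabilities should then be the inverse link applied to eta1 minus the inverse link applied to eta2. For an ordered probit fit (link = "probit"), pnorm() would take the place of plogis().

```r
library(ordinal)  # provides clm() and the wine data set

# Fit a cumulative link model (default logit link) on the packaged wine data.
fm <- clm(rating ~ temp + contact, data = wine)

# New data without the response, so predictions cover every response level.
nd <- expand.grid(temp = levels(wine$temp), contact = levels(wine$contact))

lp <- predict(fm, newdata = nd, type = "linear.predictor")
pr <- predict(fm, newdata = nd, type = "prob")

# Inverse logit link is plogis(); a probit fit would use pnorm() instead.
manual <- plogis(lp$eta1) - plogis(lp$eta2)

# Compare the hand-computed probabilities with those returned by predict().
all.equal(manual, pr$fit, check.attributes = FALSE)
```

Each row of manual holds one probability per response level, so the rows sum to one.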
