Multinomial logistic regression in R: why does multinom (nnet package) give different results from mlogit (mlogit package)?


Question

Both R functions, multinom (package nnet) and mlogit (package mlogit), can be used for multinomial logistic regression. Why does this example return different p-values for the coefficients?

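#prepare data

library(nnet)    # provides multinom()
library(mlogit)  # provides mlogit() and mlogit.data()

mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank)
# Overwrite the first 10 gre values with extreme outliers (around 80000)
mydata$gre[1:10] = rnorm(10,mean=80000)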

#multinom:

test = multinom(admit ~ gre + gpa + rank, data = mydata)
z <- summary(test)$coefficients/summary(test)$standard.errors
# For simplicity, use z-test to approximate t test.
pv <- (1 - pnorm(abs(z)))*2 
pv
# (Intercept)         gre         gpa       rank2       rank3       rank4 
# 0.00000000  0.04640089  0.00000000  0.00000000  0.00000000  0.00000000 

#mlogit:

mldata = mlogit.data(mydata,choice = 'admit', shape = "wide")

mlogit.model1 <- mlogit(admit ~ 1 | gre + gpa + rank, data = mldata)
summary(mlogit.model1)
# Coefficients :
#                  Estimate  Std. Error t-value  Pr(>|t|)
# 1:(intercept) -3.5826e+00  1.1135e+00 -3.2175 0.0012930 **
# 1:gre          1.7353e-05  8.7528e-06  1.9825 0.0474225 *
# 1:gpa          1.0727e+00  3.1371e-01  3.4195 0.0006274 ***
# 1:rank2       -6.7122e-01  3.1574e-01 -2.1258 0.0335180 *
# 1:rank3       -1.4014e+00  3.4435e-01 -4.0697 4.707e-05 ***
# 1:rank4       -1.6066e+00  4.1749e-01 -3.8482 0.0001190 ***

Why are the p-values from multinom and mlogit so different? I guess it is because of the outliers I added using mydata$gre[1:10] = rnorm(10,mean=80000). If outliers are unavoidable (for example in genomics, metabolomics, etc.), which R function should I use?

Recommended answer

The difference here is the difference between the Wald $z$ test (what you calculated in pv) and the likelihood ratio (LR) test (what is returned by summary(mlogit.model1)). The Wald test is computationally simpler, but in general has less desirable properties (e.g., its confidence intervals are not scale-invariant). You can read more about the two procedures here.
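For reference, the two test statistics for a single coefficient can be written out as below. This is a standard textbook formulation rather than anything taken from either package's documentation; the symbols $\hat\beta_j$ (the estimate), $\ell$ (the maximized log-likelihood), $\Phi$ (the standard normal CDF), and $F_{\chi^2_1}$ (the chi-squared CDF with 1 df) are notation introduced here.

```latex
% Wald test: estimate divided by its standard error
z_j = \frac{\hat\beta_j}{\widehat{\mathrm{SE}}(\hat\beta_j)},
\qquad
p_{\text{Wald}} = 2\,\bigl(1 - \Phi(|z_j|)\bigr)

% Likelihood ratio test: compare the full fit with the fit that omits term j
\Lambda_j = 2\,\bigl[\ell_{\text{full}} - \ell_{\text{without } j}\bigr],
\qquad
p_{\text{LR}} = 1 - F_{\chi^2_1}(\Lambda_j)
```

The first line is the calculation behind pv in the question; the second requires refitting the model without the term being tested.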

To perform LR tests on your nnet model coefficients, you can load the car and lmtest packages and call Anova(test) (though you'll have to do a little more work for the single-df tests).
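As a rough illustration of the "little more work" for a single-df test, the sketch below refits a multinom model without one predictor and compares log-likelihoods by hand, using only nnet. The data are simulated so the example runs offline; the sample size, coefficients, object names (dat, full, reduced), and the maxit setting are all assumptions made for illustration, not part of the original question or answer.

```r
# Sketch: single-df likelihood ratio test for one term of a multinom fit.
# All data below are simulated; names and coefficients are illustrative only.
library(nnet)

set.seed(1)
n <- 400
dat <- data.frame(
  gre  = rnorm(n, mean = 580, sd = 115),
  gpa  = rnorm(n, mean = 3.4, sd = 0.4),
  rank = factor(sample(1:4, n, replace = TRUE))
)
# Binary outcome drawn from a logistic model with made-up coefficients.
eta <- -4 + 0.002 * dat$gre + 0.8 * dat$gpa - 0.4 * as.integer(dat$rank)
dat$admit <- rbinom(n, size = 1, prob = plogis(eta))

# Full model and the model with gre dropped (maxit raised as a precaution).
full    <- multinom(admit ~ gre + gpa + rank, data = dat,
                    trace = FALSE, maxit = 500)
reduced <- multinom(admit ~ gpa + rank, data = dat,
                    trace = FALSE, maxit = 500)

# LR statistic: twice the log-likelihood gain from adding gre (1 df).
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald z-test for the same coefficient, computed as in the question.
s      <- summary(full)
wald_z <- s$coefficients["gre"] / s$standard.errors["gre"]
wald_p <- 2 * (1 - pnorm(abs(wald_z)))

print(c(LR = lr_p, Wald = unname(wald_p)))
```

On well-behaved data the two p-values are usually close; repeating the comparison on the outlier-contaminated mydata from the question would show how far they diverge there.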
