predict() with arbitrary coefficients in R


Question

I have some coefficients for a logit model that were set by a non-R user. I'd like to import those coefficients into R and generate some goodness-of-fit estimates (ROC and confusion matrix) on the same dataset, to compare against my own model. My first thought was to coerce the coefficients into an existing GLM object using something like

summary(fit)$coefficients[,1] <- y

summary(fit)$coefficients <- x

where y and x are matrices containing the coefficients I want to predict with, and fit is a dummy glm object previously fit to the dataset. Of course, this gives me only errors.

Is there any way to pass an arbitrary coefficient vector to the predict() function, or to specify the coefficients in a model? Can I somehow force this by passing a vector into the offset argument of glm()? Thanks

As mentioned in the comments, there's not much statistical basis for using arbitrary coefficients. I have a business partner who believes he/she 'knows' the right coefficients, and I'm trying to quantify the loss of predictive power from using those estimates instead of the coefficients generated by a proper model.

Edit2: Following BondedDust's answer, I was able to coerce the coefficients, but I couldn't clear the error messages that predict() returned as a result of the coercion. It appears that predict.lm, which predict calls, also looks at the rank of the coefficients, and that is what causes the error.

Recommended answer

This is not an answer to your posted question (BondedDust answered that), but it describes an alternative way of calculating the predicted probabilities yourself, which might help in this case.

# Use the mtcars dataset for a minimal worked example
data(mtcars)

# Fit a logistic regression and get the predicted probabilities
mod <- glm(vs ~ mpg + factor(gear) + factor(am), data = mtcars, family = "binomial")
p1 <- predict(mod, type = "response")

# Calculate the predicted probabilities manually
m <- model.matrix(~ mpg + factor(gear) + factor(am), mtcars)
p2 <- plogis(drop(m %*% coef(mod)))  # inverse logit of the linear predictor

# Compare with a numerical tolerance rather than exact floating-point equality
all.equal(as.numeric(p1), as.numeric(p2))

You can replace coef(mod) with the vector of coefficients you were given. model.matrix will generate the dummy variables required for the calculation; check that the column order matches the order of the coefficient vector.
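As a rough sketch of that substitution, the snippet below scores the same data with a fitted model and with an externally supplied coefficient vector, then compares confusion matrices and a rank-based AUC. The values in b_ext are invented purely for illustration, the 0.5 cutoff is an arbitrary choice, and auc() is a hypothetical helper written here in base R (the Mann-Whitney form), not a function from any package.

```r
data(mtcars)

# Design matrix with the same columns as the fitted model
X <- model.matrix(~ mpg + factor(gear) + factor(am), mtcars)

# Reference model fitted by glm()
mod <- glm(vs ~ mpg + factor(gear) + factor(am), data = mtcars, family = "binomial")

# Hypothetical externally supplied coefficients (made-up values for illustration)
b_ext <- c("(Intercept)" = -8, "mpg" = 0.4, "factor(gear)4" = 1,
           "factor(gear)5" = 0, "factor(am)1" = -2)

# Guard against ordering mistakes: align by name, not by position
stopifnot(setequal(names(b_ext), colnames(X)))
b_ext <- b_ext[colnames(X)]

# Predicted probabilities under each coefficient vector
p_fit <- plogis(drop(X %*% coef(mod)))
p_ext <- plogis(drop(X %*% b_ext))

# Confusion matrices at a 0.5 cutoff (the cutoff is an assumption)
conf_fit <- table(predicted = p_fit > 0.5, actual = mtcars$vs)
conf_ext <- table(predicted = p_ext > 0.5, actual = mtcars$vs)

# Rank-based AUC (Mann-Whitney form), so no extra package is needed
auc <- function(p, y) {
  r <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc_fit <- auc(p_fit, mtcars$vs)
auc_ext <- auc(p_ext, mtcars$vs)
```

Printing conf_fit and conf_ext side by side, along with auc_fit and auc_ext, gives one way to quantify how much predictive power is lost with the supplied coefficients. Matching by name only works if the supplied vector uses the same labels as colnames(X); otherwise the order has to be checked by hand.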
