glmnet: How do I know which factor level of my response is coded as 1 in logistic regression?

Problem description

I have a logistic regression model that I made using the glmnet package. My response variable was coded as a factor, the levels of which I will refer to as "a" and "b".

The mathematics of logistic regression labels one of the two classes as "0" and the other as "1". Each feature coefficient of a logistic regression model is positive, negative, or zero. If the coefficient of a feature "f" is positive, then increasing the value of "f" for a test observation x increases the predicted probability that x belongs to class "1".

My question is: given a glmnet model, how do you know how glmnet mapped your data's factor labels {"a", "b"} to the underlying mathematics' labels {"0", "1"}? You need to know this mapping to interpret the model's coefficients properly.

You can figure this out manually by experimenting with the output of the predict function when applied to toy observations. But it would be nice to know how glmnet implicitly handles that mapping, to speed up the interpretation process.

Thanks!

Recommended answer

Have a look at ?glmnet(

Isn't it clear now? If you have "a" and "b" as your factor levels, "a" is coded as 0, while "b" is coded as 1.

This treatment is standard. It follows from how R orders factor levels automatically, or from how you order the levels yourself. Look at:

## automatic coding by R based on alphabetical order
set.seed(0); y1 <- factor(sample(letters[1:2], 10, replace = TRUE))
## manual coding
set.seed(0); y2 <- factor(sample(letters[1:2], 10, replace = TRUE),
                   levels = c("b", "a"))

# > y1
# [1] b a a b b a b b b b
# Levels: a b
# > y2
# [1] b a a b b a b b b b
# Levels: b a

# > levels(y1)
# [1] "a" "b"
# > levels(y2)
# [1] "b" "a"
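
To make the 0/1 mapping explicit, the level order can be turned into integer codes directly. This is a minimal sketch in base R; the vector `y` and the names `code_auto`, `code_manual`, `target_auto`, and `target_manual` are made up for illustration and are not part of glmnet.

```r
## Hypothetical toy response, not taken from the question's data
y <- c("b", "a", "a", "b")

## Default level order is alphabetical: levels are c("a", "b")
y_auto <- factor(y)
## Manually reversed level order: levels are c("b", "a")
y_manual <- factor(y, levels = c("b", "a"))

## A factor stores 1-based integer codes in level order, so subtracting 1
## gives the 0/1 coding: first level -> 0, second level -> 1
code_auto   <- as.integer(y_auto) - 1L    # a -> 0, b -> 1
code_manual <- as.integer(y_manual) - 1L  # b -> 0, a -> 1

## The class treated as "1" is the second (last) level
target_auto   <- levels(y_auto)[2]
target_manual <- levels(y_manual)[2]
```

So checking `levels(y)` before fitting is enough to tell which class the coefficients refer to.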

Whether you use glmnet(), or simply glm(), the same thing happens.
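
The effect on the coefficients can be checked with a small experiment. The sketch below uses glm() from base R as a stand-in, since the answer says both functions behave the same way; the data and all variable names are invented for illustration. Reversing the level order should flip the sign of the slope, because the fitted probability then refers to the other class.

```r
## Hypothetical toy data: "b" becomes more common as x grows
x <- 1:10
y <- c("a", "a", "a", "b", "a", "b", "a", "b", "b", "b")

d_ab <- data.frame(x = x, y = factor(y))                        # levels: a, b
d_ba <- data.frame(x = x, y = factor(y, levels = c("b", "a")))  # levels: b, a

## With a factor response, the first level is "failure" (0)
## and the second level is "success" (1)
fit_ab <- glm(y ~ x, family = binomial, data = d_ab)
fit_ba <- glm(y ~ x, family = binomial, data = d_ba)

slope_ab <- unname(coef(fit_ab)["x"])  # models P(y == "b"): positive here
slope_ba <- unname(coef(fit_ba)["x"])  # models P(y == "a"): negative here

## Fitted probabilities always refer to the second level
p_ab <- predict(fit_ab, type = "response")  # P(y == "b")
p_ba <- predict(fit_ba, type = "response")  # P(y == "a")
```

The two fits describe the same model from opposite sides: the slopes differ only in sign, and `p_ab + p_ba` is 1 for every observation.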
