Why am I losing categorical data in my regression summary?

Problem description

box <- read.csv("BlackBoxtrainApril22.csv")

# Convert the two categorical variables to factors
box$SOUND <- as.factor(box$SOUND)
box$SWITCH <- as.factor(box$SWITCH)

# Split into training and testing data
train <- box[1:12000, ]
test <- box[12001:18048, ]

library(nnet)
multinom_model <- multinom(SOUND ~ ., data = box)
summary(multinom_model)

Here's some output from dput(head(box)) to see what the data looks like:

structure(list(ID = c(86623L, 57936L, 54301L, 2678L, 65827L, 22420L), INPUT1 = c(30L, 87L, 16L, 64L, 33L, 5L), INPUT2 = c(31L, 76L, 33L, 77L, 72L, 50L), INPUT3 = c(72L, 31L, 87L, 91L, 53L, 26L), INPUT4 = c(29L, 79L, 41L, 59L, 66L, 50L), SWITCH = c("Low", "Low", "Low", "Minimum", "High", "High"), SOUND = c("Gargle", "Tick", "Tick", "Beep", "Beep", "Gargle")), row.names = c(NA, 6L), class = "data.frame")

In essence, I'm trying to predict a categorical variable from a combination of numeric and categorical data. The code above is what I'm running. When I print the summary, one of the SWITCH categories and one of the SOUND categories is missing. I think it has something to do with reference categories, but I'm not exactly sure.

Recommended answer

You're right about the reference categories. When you include a categorical/factor variable in a model, R's default treatment coding leaves one category of the variable out and uses it as the reference category. The estimates for the categories that you do see in the output are relative to the category that was left out. For example, if you have a factor variable with categories "red", "blue", and "green", and "red" is the reference category, then the model estimates for "blue" and "green" will be for "blue" vs "red" and "green" vs "red", respectively. The same applies to the outcome in multinom(): the first level of SOUND is the baseline, so the summary lists coefficients only for the other levels.
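To see which categories are acting as the reference, it can help to inspect the factor levels and the dummy columns R builds from them. The sketch below uses only base R and the six rows from the dput(head(box)) output in the question (a subset of its columns). It assumes R's default contrasts for unordered factors (treatment coding); the names cols_default and cols_releveled are made up for illustration.

```r
# Toy data copied from the dput(head(box)) output in the question
box <- data.frame(
  INPUT1 = c(30L, 87L, 16L, 64L, 33L, 5L),
  SWITCH = c("Low", "Low", "Low", "Minimum", "High", "High"),
  SOUND  = c("Gargle", "Tick", "Tick", "Beep", "Beep", "Gargle")
)
box$SWITCH <- as.factor(box$SWITCH)
box$SOUND  <- as.factor(box$SOUND)

# as.factor() sorts the levels; the first level becomes the reference
levels(box$SWITCH)  # "High" "Low" "Minimum"
levels(box$SOUND)   # "Beep" "Gargle" "Tick"

# With default treatment contrasts, every level except the reference
# gets its own dummy column, so "High" has no SWITCH column here
cols_default <- colnames(model.matrix(~ INPUT1 + SWITCH, data = box))
print(cols_default)

# relevel() moves a chosen level to the front, making it the reference
box$SWITCH <- relevel(box$SWITCH, ref = "Low")
cols_releveled <- colnames(model.matrix(~ INPUT1 + SWITCH, data = box))
print(cols_releveled)
```

Nothing is lost from the data in either case: the "missing" category is the one every other coefficient is compared against. Calling relevel() on SOUND before fitting should change the outcome baseline in multinom() in the same way.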
