Is there an advantage to ordering a categorical variable?

Question

I have been advised that it is best to order categorical variables where appropriate (e.g. short less than medium less than long). I am wondering, what is the specific advantage of treating a categorical variable as ordered as opposed to just simple categorical, in the context of modelling it as an explanatory variable? What does it mean mathematically (in lay terms preferably!)?

Many thanks!

Recommended answer

Among other things, it allows you to compare values from those factors:

> ord.fac <- ordered(c("small", "medium", "large"), levels=c("small", "medium", "large"))
> fac <- factor(c("small", "medium", "large"), levels=c("small", "medium", "large"))
> ord.fac[[1]] < ord.fac[[2]]
[1] TRUE
> fac[[1]] < fac[[2]]
[1] NA
Warning message:
  In Ops.factor(fac[[1]], fac[[2]]) : ‘<’ not meaningful for factors
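
Because comparisons are defined for ordered factors, they can also be used for subsetting and for summaries such as max(). The following is a small sketch along the same lines as the transcript above; it reuses the ord.fac definition from that transcript and is only meant as an illustration:

```r
# Same ordered factor as in the transcript above
ord.fac <- ordered(c("small", "medium", "large"),
                   levels = c("small", "medium", "large"))

# Comparison against a level name works element-wise,
# so it can be used to subset the factor
at.least.medium <- ord.fac[ord.fac >= "medium"]
print(at.least.medium)

# Summaries that rely on an ordering are defined for ordered factors
# (calling max() on a plain, unordered factor raises an error instead)
largest <- max(ord.fac)
print(largest)
```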

Documentation suggests there is quite an impact from a modeling perspective:

Ordered factors differ from factors only in their class, but methods and the model-fitting functions treat the two classes quite differently

but I'll have to let someone familiar with those use cases provide the details on that.
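
As a hedged illustration of what "treat the two classes quite differently" can mean in practice: with R's default contrast settings, lm() codes an unordered factor with treatment (dummy) contrasts and an ordered factor with orthogonal polynomial contrasts, so the coefficients describe linear (.L) and quadratic (.Q) trends across the levels rather than differences from a baseline level. The sketch below assumes options("contrasts") has not been changed; the response values y are made up purely for illustration.

```r
# Hypothetical toy data: four made-up observations per size level
size.levels <- c("small", "medium", "large")
size <- rep(size.levels, each = 4)
y <- c(1.1, 0.9, 1.2, 1.0,
       2.1, 1.9, 2.0, 2.2,
       2.9, 3.1, 3.0, 3.2)

fac     <- factor(size,  levels = size.levels)
ord.fac <- ordered(size, levels = size.levels)

# Default coding (assumes options("contrasts") is at its default):
print(contrasts(fac))      # treatment coding: each level vs. "small"
print(contrasts(ord.fac))  # polynomial coding: linear and quadratic trend

fit.fac <- lm(y ~ fac)
fit.ord <- lm(y ~ ord.fac)

# The coefficients are named, and interpreted, differently
print(names(coef(fit.fac)))  # "(Intercept)" "facmedium" "faclarge"
print(names(coef(fit.ord)))  # "(Intercept)" "ord.fac.L" "ord.fac.Q"

# Both codings span the same three group means,
# so the fitted values agree
same.fit <- all.equal(fitted(fit.fac), fitted(fit.ord))
print(same.fit)
```

In this sketch the two models fit the data identically; what changes is how the coefficients are parameterised. With an ordered factor the .L term gives a direct test for a monotone trend across small, medium and large, which is one reason ordering can be useful for an explanatory variable.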
