Recoding dummy variable to ordered factor

Question

I need some help with coding factors for a logistic regression.

I have six dummy variables representing income brackets. I want to convert these into a single ordered factor for use in a logistic regression.

My data frame looks like this:

    INC1 INC2 INC3 INC4 INC5 INC6
1      0    0    1    0    0    0  
2     NA   NA   NA   NA   NA   NA  
3      0    0    0    0    0    1  
4      0    0    0    0    0    1  
5      0    0    1    0    0    0  
6      0    0    0    1    0    0  
7      0    0    1    0    0    0  
8      0    0    0    1    0    0

What I want it to look like:

    INC
1   INC3  
2   NA   
3   INC6  
4   INC6  
5   INC3 
6   INC4  
7   INC3  
8   INC4   

This must be a common (and simple) operation, but my searches have not turned up a concise answer for how to perform this recoding. Any help is very much appreciated.

Recommended answer

Here's a solution, based on another answer, that keeps the NA values and converts to an ordered factor.

> inc
  INC1 INC2 INC3 INC4 INC5 INC6
1    0    0    1    0    0    0
2   NA   NA   NA   NA   NA   NA
3    0    0    0    0    0    1
4    0    0    0    0    0    1
5    0    0    1    0    0    0
6    0    0    0    1    0    0
7    0    0    1    0    0    0
8    0    0    0    1    0    0
> inc$F <- factor(apply(inc, 1, function(x) names(x)[x == 1]), levels = names(inc), ordered = TRUE)

> inc
  INC1 INC2 INC3 INC4 INC5 INC6    F
1    0    0    1    0    0    0 INC3
2   NA   NA   NA   NA   NA   NA <NA>
3    0    0    0    0    0    1 INC6
4    0    0    0    0    0    1 INC6
5    0    0    1    0    0    0 INC3
6    0    0    0    1    0    0 INC4
7    0    0    1    0    0    0 INC3
8    0    0    0    1    0    0 INC4
> inc$F
[1] INC3 <NA> INC6 INC6 INC3 INC4 INC3 INC4
Levels: INC1 < INC2 < INC3 < INC4 < INC5 < INC6
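
To try this without the original data, the example can be rebuilt by hand. The following is only a sketch: the `data.frame()` construction and the `high` variable are my own additions, and it assumes the six columns are plain numeric 0/1 vectors as the printout suggests.

```r
# Rebuild the example data (values copied from the printout above).
inc <- data.frame(
  INC1 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC2 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC3 = c(1, NA, 0, 0, 1, 0, 1, 0),
  INC4 = c(0, NA, 0, 0, 0, 1, 0, 1),
  INC5 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC6 = c(0, NA, 1, 1, 0, 0, 0, 0)
)

# Same idea as the one-liner above. The levels come from the column names,
# so the ordering INC1 < ... < INC6 simply follows the column order.
inc$F <- factor(apply(inc, 1, function(x) names(x)[x == 1]),
                levels = names(inc), ordered = TRUE)

# Because F is ordered, comparing against a level is meaningful
# (illustrative only; `high` is not part of the original answer).
high <- inc$F >= "INC4"
```

Note that `names(inc)` is evaluated before `F` is added, so the line should be run on a data frame that contains only the dummy columns.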

This will break if any single row of the data frame contains more than one 1.
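
If rows with more than one 1 (or with no 1 at all) are possible, one option is to count the 1s per row first and only recode rows that have exactly one. This is a sketch rather than part of the original answer: the helper name `dummies_to_ordered`, the toy data `inc2`, and the choice to return NA for ambiguous rows are all my assumptions.

```r
# Hypothetical helper: turn 0/1 dummy columns into one ordered factor.
# Any row that does not contain exactly one 1 is returned as NA.
dummies_to_ordered <- function(df, cols = names(df)) {
  m <- as.matrix(df[cols])
  hits <- m == 1                         # logical matrix, NA where data are missing
  n_ones <- rowSums(hits, na.rm = TRUE)  # how many 1s each row has
  hits[is.na(hits)] <- FALSE
  # Column index of the first 1 in each row (hits * 1 makes the matrix numeric).
  idx <- max.col(hits * 1, ties.method = "first")
  idx[n_ones != 1] <- NA                 # all-NA, all-zero, or ambiguous rows
  factor(cols[idx], levels = cols, ordered = TRUE)
}

# Toy data (made up for illustration): row 2 is all NA,
# row 3 has two 1s, row 4 has none.
inc2 <- data.frame(
  INC1 = c(0, NA, 1, 0),
  INC2 = c(1, NA, 1, 0),
  INC3 = c(0, NA, 0, 0)
)
res <- dummies_to_ordered(inc2)
```

Here only the first row is recoded (to `INC2`); the other three come back as NA. If an ambiguous row should be treated as a data error instead, replacing the NA assignment with a `stop()` on `any(n_ones > 1)` may be the better choice.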
