Convert multiple binary columns to single categorical column

Problem description

I have a table full of binary variables that I would like to condense down to a single categorical variable.

Very simplistically, I have a data frame like this:

data <- data.frame(id=c(1,2,3,4,5,6,7,8,9), red=c("1","0","0","0","1","0","0","0","0"),blue=c("0","1","1","1","0","1","1","1","0"),yellow=c("0","0","0","0","0","0","0","0","1"))
data
  id   red   blue  yellow
1  1   1    0      0
2  2   0    1      0
3  3   0    1      0
4  4   0    1      0
5  5   1    0      0
6  6   0    1      0
7  7   0    1      0
8  8   0    1      0
9  9   0    0      1

What I would like to get is:

  id   color 
1  1   red    
2  2   blue   
3  3   blue    
4  4   blue    
5  5   red    
6  6   blue    
7  7   blue    
8  8   blue    
9  9   yellow 

I hope there's a really simple answer for this.

Recommended answer

You can get the values by making use of the column names and as.logical. However, since your "binary" columns are factors, you need to jump through a few more hoops:

> apply(data[-1], 1, function(x) names(x)[as.logical(as.numeric(as.character(x)))])
[1] "red"    "blue"   "blue"   "blue"   "red"    "blue"   "blue"   "blue"   "yellow"
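
To see why the conversion chain is needed, here is a minimal sketch on a made-up factor `f` (the name and values are illustrative, not from the question). Calling as.numeric directly on a factor returns the internal level codes, and as.logical on the strings "0" and "1" returns NA, so the value has to pass through character and then numeric first. Note that in R 4.0.0 and later data.frame() keeps strings as character by default, so the columns in the question may be character rather than factor; the same chain works in both cases.

```r
# Hypothetical factor standing in for one "binary" column
f <- factor(c("1", "0", "0"))

codes  <- as.numeric(f)                 # level codes (2, 1, 1), not the values
values <- as.numeric(as.character(f))   # the original values (1, 0, 0)
flags  <- as.logical(values)            # TRUE, FALSE, FALSE

# as.logical() does not understand the strings "0" and "1"
direct <- as.logical(c("1", "0"))       # NA, NA
```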

Bind this back with the first column (data[1]) to get the output you want.

cbind(data[1], 
      color = apply(data[-1], 1, 
                    function(x) names(x)[as.logical(as.numeric(
                      as.character(x)))]))
#   id  color
# 1  1    red
# 2  2   blue
# 3  3   blue
# 4  4   blue
# 5  5    red
# 6  6   blue
# 7  7   blue
# 8  8   blue
# 9  9 yellow
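
The apply approach above returns a plain character vector only when every row has exactly one 1; otherwise it returns a list. As a hedged sketch of a different base R route (not part of the original answer), max.col can pick the column index of the 1 in each row of a numeric matrix. The names `m`, `idx` and `result` are illustrative, and the guard assumes each row has exactly one 1.

```r
data <- data.frame(
  id     = 1:9,
  red    = c("1", "0", "0", "0", "1", "0", "0", "0", "0"),
  blue   = c("0", "1", "1", "1", "0", "1", "1", "1", "0"),
  yellow = c("0", "0", "0", "0", "0", "0", "0", "0", "1")
)

# Numeric 0/1 matrix built from the indicator columns (works for factor or character)
m <- sapply(data[-1], function(x) as.numeric(as.character(x)))

# Assumption: exactly one indicator is set per row; stop early if not
stopifnot(all(rowSums(m) == 1))

# Index of the largest value in each row, i.e. the column holding the 1
idx <- max.col(m, ties.method = "first")

result <- data.frame(id = data$id, color = colnames(m)[idx])
result
```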

Alternatively, you can try the following:

data[-1] <- lapply(data[-1], function(x) as.numeric(as.character(x)))
temp <- subset(cbind(data[1], stack(data[-1])), values == 1, c("id", "ind"))
temp[order(temp$id), ]
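
To make the stack step easier to follow, here is a small sketch on a made-up three-row data frame (`small`, `long` and `res` are illustrative names). stack turns the indicator columns into two columns, `values` and `ind`, where `ind` holds the original column name. The id column is repeated explicitly here rather than relying on recycling, and the `ind` column is renamed to `color` at the end.

```r
# Hypothetical miniature version of the question's data, already numeric
small <- data.frame(id = 1:3, red = c(1, 0, 0), blue = c(0, 1, 1))

# Long format: one row per (id, column) pair, with columns "values" and "ind"
long <- stack(small[-1])
long$id <- rep(small$id, times = ncol(small) - 1)

# Keep only the rows where the indicator is 1, then order by id
res <- long[long$values == 1, c("id", "ind")]
res <- res[order(res$id), ]
names(res)[2] <- "color"
res
```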

Or, you can use a combination of "dplyr" and "tidyr", like this:

library(dplyr)
library(tidyr)

data %>%
  group_by(id) %>%
  mutate_each(funs(an = as.numeric(as.character(.)))) %>%
  gather(color, val, -id) %>%
  filter(val == 1) %>%
  select(-val) %>%
  arrange(id)
# Source: local data frame [9 x 2]
# 
#   id  color
# 1  1    red
# 2  2   blue
# 3  3   blue
# 4  4   blue
# 5  5    red
# 6  6   blue
# 7  7   blue
# 8  8   blue
# 9  9 yellow
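
The pipeline above uses mutate_each, funs and gather, which have since been deprecated or superseded in dplyr and tidyr. A rough equivalent for current versions, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later are installed, could look like the sketch below. The name `result` is illustrative, and the output is a tibble rather than the "local data frame" shown above.

```r
library(dplyr)
library(tidyr)

data <- data.frame(
  id     = 1:9,
  red    = c("1", "0", "0", "0", "1", "0", "0", "0", "0"),
  blue   = c("0", "1", "1", "1", "0", "1", "1", "1", "0"),
  yellow = c("0", "0", "0", "0", "0", "0", "0", "0", "1")
)

result <- data %>%
  # Convert every indicator column (factor or character) to numeric 0/1
  mutate(across(-id, ~ as.numeric(as.character(.x)))) %>%
  # One row per (id, color) pair
  pivot_longer(-id, names_to = "color", values_to = "val") %>%
  # Keep the color whose indicator is set
  filter(val == 1) %>%
  select(-val) %>%
  arrange(id)

result
```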
