Split character column into several binary (0/1) columns


Question

I have a character vector like this:

a <- c("a,b,c", "a,b", "a,b,c,d")

What I would like to do is create a data frame where the individual letters in each string are represented by dummy columns:

   a    b    c    d
1] 1    1    1    0
2] 1    1    0    0
3] 1    1    1    1

I have a feeling that I need to use some combination of read.table and reshape, but I am really struggling. Any help is appreciated.

Recommended answer

You can try cSplit_e from my "splitstackshape" package:

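library(splitstackshape)
library(data.table)  # for as.data.table(); attached explicitly in case splitstackshape only imports it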
a <- c("a,b,c", "a,b", "a,b,c,d")
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0)
#          a a_a a_b a_c a_d
# 1:   a,b,c   1   1   1   0
# 2:     a,b   1   1   0   0
# 3: a,b,c,d   1   1   1   1
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0, drop = TRUE)
#    a_a a_b a_c a_d
# 1:   1   1   1   0
# 2:   1   1   0   0
# 3:   1   1   1   1


There is also mtabulate from "qdapTools":

library(qdapTools)
mtabulate(strsplit(a, ","))
#   a b c d
# 1 1 1 1 0
# 2 1 1 0 0
# 3 1 1 1 1



A very direct base R approach is to use table along with stack and strsplit:

table(rev(stack(setNames(strsplit(a, ",", TRUE), seq_along(a)))))
#    values
# ind a b c d
#   1 1 1 1 0
#   2 1 1 0 0
#   3 1 1 1 1
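
Note that the result above is a table object rather than the data frame the question asks for, and that table() counts occurrences, so a string such as "a,a,b" would give 2 rather than 1. A minimal sketch of one way to handle both points, using base R only, might look like this (the names tab, binary and result are just illustrative):

```r
a <- c("a,b,c", "a,b", "a,b,c,d")

# Same table as above: one row per input string, one column per letter
tab <- table(rev(stack(setNames(strsplit(a, ",", fixed = TRUE), seq_along(a)))))

# table() returns counts; drop the "table" class and clamp to 0/1
# so that a letter repeated within one string still yields 1
binary <- unclass(tab) > 0
storage.mode(binary) <- "integer"

# A plain matrix converts to a wide data frame (one column per letter)
result <- as.data.frame(binary)
result
```

The unclass() step matters here: calling as.data.frame() directly on a table object would give the long format (one row per combination with a Freq column) instead of the wide 0/1 layout.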
