Reshape multiple categorical variables to binary response variables
Problem description
I am trying to convert the following format:
mydata <- data.frame(movie = c("Titanic", "Departed"),
actor1 = c("Leo", "Jack"),
actor2 = c("Kate", "Leo"))
movie actor1 actor2
1 Titanic Leo Kate
2 Departed Jack Leo
to binary response variables:
movie Leo Kate Jack
1 Titanic 1 1 0
2 Departed 1 0 1
I tried the solution described in "Convert row data to binary columns", but I could only get it to work for two variables, not three.
I would really appreciate a clean way to do this.
Recommended answer
An updated tidyr-based option is to convert the data to long shape, use complete() to fill in the missing combinations of movies and actors, and then convert a logical is.na() test to a numeric value. Then reshape back to wide.
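library(tidyr)
mydata %>%
  # one row per movie/actor pair; "acted" holds the source column name
  pivot_longer(starts_with("actor"), names_to = "acted") %>%
  # add the missing movie/actor combinations, with acted = NA
  complete(movie, value) %>%
  # 1 if the pair was in the original data, 0 if it was filled in
  dplyr::mutate(acted = as.numeric(!is.na(acted))) %>%
  # one column per actor
  pivot_wider(names_from = value, values_from = acted)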
#> # A tibble: 2 x 4
#> movie Jack Leo Kate
#> <fct> <dbl> <dbl> <dbl>
#> 1 Departed 1 1 0
#> 2 Titanic 0 1 1
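For comparison, the same long-then-wide idea can be sketched in base R without tidyr. This is not part of the answer above, only a minimal sketch under stated assumptions: the actor columns all start with "actor", and the helper names actor_cols, long, counts, and binary are made up for illustration.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the names as character
mydata <- data.frame(movie  = c("Titanic", "Departed"),
                     actor1 = c("Leo", "Jack"),
                     actor2 = c("Kate", "Leo"),
                     stringsAsFactors = FALSE)

# Assumption: every actor column name starts with "actor"
actor_cols <- grep("^actor", names(mydata), value = TRUE)

# Long shape: one row per movie/actor pair
long <- data.frame(
  movie = rep(mydata$movie, times = length(actor_cols)),
  actor = unlist(lapply(mydata[actor_cols], as.character), use.names = FALSE),
  stringsAsFactors = FALSE
)

# Cross-tabulate movies against actors, then reduce the counts to 0/1
counts <- table(long$movie, long$actor)
binary <- as.data.frame.matrix((counts > 0) * 1)

print(binary)
```

Here the movies end up as row names rather than as a movie column, and rows and columns follow the sorted order that table() produces, so the layout can differ from the tidyr result.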