Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names


Question

library(tidyr)
library(dplyr)
library(tidyverse)

Below is the code for a simple data frame. I have some messy data that was exported with the levels of each factor spread across separate columns.

Client<-c("Client1","Client2","Client3","Client4","Client5")
Sex_M<-c("Male","NA","Male","NA","Male")
Sex_F<-c(" ","Female"," ","Female"," ")
Satisfaction_Satisfied<-c("Satisfied"," "," ","Satisfied","Satisfied")
Satisfaction_VerySatisfied<-c(" ","VerySatisfied","VerySatisfied"," "," ")
CommunicationType_Email<-c("Email"," "," ","Email","Email")
CommunicationType_Phone<-c(" ","Phone ","Phone "," "," ")
DF<-tibble(Client,Sex_M,Sex_F,Satisfaction_Satisfied,Satisfaction_VerySatisfied,CommunicationType_Email,CommunicationType_Phone)

I want to recombine the categories into single columns using tidyr's "unite".

DF<-DF%>%unite(Sat,Satisfaction_Satisfied,Satisfaction_VerySatisfied,sep=" ")%>%
unite(Sex,Sex_M,Sex_F,sep=" ")

However, I have to write multiple "unite" lines, and I feel this violates the rule of three, so there must be a way to make this easier, especially since my real data contains dozens of columns that need to be combined. Is there a way to use "unite" once but refer to matching column names, so that all columns with similar names (for example, "Sex_M" and "Sex_F" both contain "Sex", and "CommunicationType_Email" and "CommunicationType_Phone" both contain "CommunicationType") are combined with the formula above?

I was also thinking about a function that lets me pass in column names, but that is too difficult for me since it involves complex standard evaluation.

Recommended answer

We can use unite with a select helper such as matches, so the columns are picked by a pattern in their names:

library(tidyverse)
DF %>% 
    unite(Sat, matches("^Sat"))
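The same call can be repeated for every prefix without writing each line by hand. The following is only a sketch, not part of the original answer: it assumes every column to be merged is named `Prefix_Level`, that blank cells are empty strings (the question's data uses `" "` and `"NA"`, which would need cleaning first), and it uses a reduced two-row version of the sample data. The names `prefixes` and `out` are illustrative.

```r
library(tidyr)
library(tibble)

# Reduced sample data; blanks are "" here, which is an assumption
DF <- tibble(
  Client = c("Client1", "Client2"),
  Sex_M = c("Male", ""),
  Sex_F = c("", "Female"),
  Satisfaction_Satisfied = c("Satisfied", ""),
  Satisfaction_VerySatisfied = c("", "VerySatisfied")
)

# Prefix = everything before the first underscore, taken only from columns that have one
prefixes <- unique(sub("_.*$", "", grep("_", names(DF), value = TRUE)))

out <- DF
for (p in prefixes) {
  # !!p injects the string as the name of the new column;
  # starts_with() selects the columns that share the prefix
  out <- unite(out, !!p, starts_with(paste0(p, "_")), sep = "")
}
out
```

Because `sep = ""` is used and only one column per group is non-empty in each row, each united column ends up holding the single filled-in level.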

For multiple cases, perhaps

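gather(DF, Var, Val, -Client, na.rm = TRUE) %>%
        separate(Var, into = c("Var1", "Var2")) %>%
        # the sample data marks empty cells with " " or the string "NA", so trim before filtering
        mutate(Val = trimws(Val)) %>%
        group_by(Client, Var1) %>% 
        summarise(Val = paste(Val[!(is.na(Val) | Val %in% c("", "NA"))], collapse = "_")) %>%
        spread(Var1, Val)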
#  Client CommunicationType  Satisfaction    Sex
#*   <chr>             <chr>         <chr>  <chr>
#1 Client1             Email     Satisfied   Male
#2 Client2             Phone VerySatisfied Female
#3 Client3             Phone VerySatisfied   Male
#4 Client4             Email     Satisfied Female
#5 Client5             Email     Satisfied   Male
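In current tidyr, gather and spread are superseded by pivot_longer and pivot_wider, so the same reshaping might be written as below. This is a sketch rather than the answerer's code: it assumes tidyr 1.0 or later, treats `" "`, `""` and the string `"NA"` as empty cells, assumes each client has exactly one filled-in level per group, and uses a reduced two-row version of the sample data.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Reduced sample data using the same blank markers as the question
DF <- tibble(
  Client = c("Client1", "Client2"),
  Sex_M = c("Male", "NA"),
  Sex_F = c(" ", "Female"),
  CommunicationType_Email = c("Email", " "),
  CommunicationType_Phone = c(" ", "Phone ")
)

out <- DF %>%
  # split "Sex_M" into Var1 = "Sex" and Var2 = "M" while going long
  pivot_longer(-Client, names_to = c("Var1", "Var2"),
               names_sep = "_", values_to = "Val") %>%
  mutate(Val = trimws(Val)) %>%            # drop padding such as "Phone "
  filter(!(Val %in% c("", "NA"))) %>%      # keep only filled-in cells
  select(-Var2) %>%
  # one column per prefix, one row per client
  pivot_wider(names_from = Var1, values_from = Val)
out
```

If a client could have more than one non-empty level in a group, pivot_wider would return list-columns unless a `values_fn` is supplied to combine them.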
