Converting columns of a list of data frames to factor

Problem description

I am labelling my data frame manually, as shown below. I have 800 columns to label. After labelling, I create subsets of the data frame (there are many subsets) and pass each one to a function for calculation.

The labels can differ between chunks, and creating them one by one for every chunk is very time-consuming.

data<-data.frame( col1=c(1,1,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,1,1,1,NA,1,1,NA,NA,NA,NA,1,NA,NA,NA,NA,1,NA,1),
                  col2=c(1,1,1,1,1,NA,NA,NA,NA,1,1,1,1,1,NA,NA,NA,1,1,1,NA,1,1,1,1,1,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA),
                  col3=c(1,1,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,1,1,1,NA,NA,NA,1,NA,NA,1,1,1,1,1,NA,NA,1),
                  col4=c(1,NA,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  col5=c(1,2,1,1,1,2,1,2,2,1,2,NA,1,1,2,2,2,1,1,1,2,NA,2,1,1,1,2,2,2,NA,1,2,2,1,1,1,2,2,2)
)  

data$col5<-factor(data$col5, levels=c(1,2), labels=c("Local","Overseas"))

df<- data
df$cc1<-1
df2<- subset(df, col5 == 'Local')
df$cc2<-ifelse(df$col5 == 'Local',1,NA)
lst<-list(df$cc1, df$cc2)
ldat<-list("ALL" = df, "Local" =df2)

col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")

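library(dplyr)   # mutate()
library(purrr)   # map(), map_chr()
library(rlang)   # parse_exprs()

# 'faclist' is a named list: each name is a column, each element its label vector.
# Build the text of one factor() call for the x-th entry of 'faclist'.
make_mutator <- function(x) {
  paste0(
    "factor(", names(faclist)[[x]],
    ",labels=c('",
    paste0(faclist[[x]], collapse = "','"),
    "'))"
  )
}

# One expression string per column, named after the column it overwrites
list_of_fac <- purrr::map_chr(seq_along(faclist), make_mutator)
names(list_of_fac) <- names(faclist)

# Parse the strings and splice them into mutate() for every data frame
ldat <- purrr::map(ldat, ~ mutate(.x, !!!parse_exprs(list_of_fac)))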
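
To make the parsing approach easier to follow, here is a minimal base R sketch of the strings that the helper builds. The two-entry 'faclist' is hypothetical (the question does not show its definition), and vapply stands in for purrr::map_chr so the snippet runs without extra packages.

```r
# Hypothetical named list: column name -> label vector (an assumption for illustration)
faclist <- list(col1 = "Sales", col5 = c("Local", "Overseas"))

# Same string-building logic as the helper in the question
make_mutator <- function(x) {
  paste0(
    "factor(", names(faclist)[[x]],
    ",labels=c('",
    paste0(faclist[[x]], collapse = "','"),
    "'))"
  )
}

# vapply() is used here in place of purrr::map_chr()
exprs <- vapply(seq_along(faclist), make_mutator, character(1))
names(exprs) <- names(faclist)

print(exprs)
```

Each string is the text of a factor() call, which is why the question then needs parse_exprs to turn it back into code before mutate can evaluate it.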

This works fine for me, but I would like a solution for the case where the column names and labels are supplied separately, like this:

col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")

How can I change my function to handle this?

Recommended answer

Instead of parsing, an easier option is to loop over the list with map and, inside that loop, use map2. map2 receives the columns of interest together with the labels to apply, both taken from the named list 'faclist'.

library(dplyr)
library(purrr)
ldat1 <- map(ldat, ~ {
  # pair each column named in 'faclist' with its label vector
  .x[names(faclist)] <- map2(
    .x %>% dplyr::select(all_of(names(faclist))),
    faclist,
    ~ factor(.x, labels = .y)
  )
  .x
})

Output

str(ldat1[[1]])
#'data.frame':  39 obs. of  7 variables:
# $ col1: Factor w/ 1 level "Sales": 1 1 NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 1 NA NA NA NA 1 ...
# $ col3: Factor w/ 1 level "Management": 1 1 NA NA NA NA NA 1 NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA NA ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 2 1 1 1 2 1 2 2 1 ...
# $ cc1 : num  1 1 1 1 1 1 1 1 1 1 ...
# $ cc2 : num  1 NA 1 1 1 NA 1 NA NA 1 ...
str(ldat1[[2]])
#'data.frame':  18 obs. of  6 variables:
# $ col1: Factor w/ 1 level "Sales": 1 NA NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 NA 1 1 1 1 1 ...
# $ col3: Factor w/ 1 level "Management": 1 NA NA NA NA NA NA NA NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA 1 ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 1 1 1 1 1 1 1 1 1 ...
# $ cc1 : num  1 1 1 1 1 1 1 1 1 1 ...
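
One point worth keeping in mind with factor(.x, labels = .y): when levels is not given, factor() takes the levels from the values actually present, so the number of labels has to match what each subset happens to contain. The sketch below uses small made-up vectors (not the question's data) to show how passing levels explicitly, as the question already does for col5, keeps subsets consistent.

```r
# Made-up vectors for illustration only
x_all   <- c(1, 2, 1, NA)   # both codes present
x_local <- c(1, 1, NA)      # a subset where code 2 never occurs

labs <- c("Local", "Overseas")

# Explicit levels keep both factors on the same level set
f_all   <- factor(x_all,   levels = c(1, 2), labels = labs)
f_local <- factor(x_local, levels = c(1, 2), labels = labs)

print(levels(f_local))

# Without 'levels', two labels for one observed value is an error
res <- tryCatch(factor(x_local, labels = labs), error = function(e) "error")
print(res)
```

If the codes behind each label are known in advance, supplying them through levels is probably the safer choice when the same labels are applied to many subsets.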




If the inputs are two vectors rather than a named list, replace names(faclist) with the 'col_names' vector and the list 'faclist' with the 'labels' vector.

ldat1 <- map(ldat, ~ {
  # 'labels' is iterated element by element, so each column gets exactly one label
  .x[col_names] <- map2(
    .x %>% dplyr::select(all_of(col_names)),
    labels,
    ~ factor(.x, labels = .y)
  )
  .x
})
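
For readers who want to try the two-vector variant without installing anything, the following is a minimal base R sketch of the same idea, with lapply standing in for map and Map standing in for map2. The tiny data frame and the two-element 'col_names' and 'labels' vectors are hypothetical stand-ins, not the question's 800 columns.

```r
# Small hypothetical data standing in for the question's 'ldat'
df <- data.frame(col1 = c(1, NA, 1), col2 = c(NA, 1, 1), col5 = c(1, 2, 1))
ldat <- list(ALL = df, Local = df[df$col5 == 1, ])

col_names <- c("col1", "col2")   # hypothetical: columns to convert
labels    <- c("Sales", "Ops")   # hypothetical: one label per column

ldat1 <- lapply(ldat, function(d) {
  # Map() pairs each selected column with its label, like purrr::map2()
  d[col_names] <- Map(function(col, lab) factor(col, labels = lab),
                      d[col_names], labels)
  d
})

str(ldat1$ALL)
```

Because 'labels' is a plain character vector here, each column receives a single label, so this form only suits columns with one non-missing value; columns with several levels would need a list of label vectors, as with 'faclist'.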
