How to convert column types in R tidyverse

Question

I'm trying to get comfortable with using the Tidyverse, but data type conversions are proving to be a barrier. I understand that automatically converting strings to factors is not ideal, but sometimes I would like to use factors, so some approach to easily converting desired character columns in a tibble to factors would be excellent. I prefer to read in Excel files with the readxl package, but factors aren't a permitted column type! I can go through column by column after the fact, but that's really not efficient. I want either of the following two things to work:

  1. Read in a file and simultaneously specify which columns should be read as factors:

 data <- read_excel(path = "myfile.xlsx", 
                    col_types = c(col2 = "factor", col5 = "factor"))

  2. Or this function would be excellent for many reasons, but I can't figure out how it's supposed to work. The col_types function is very confusing to me:

     diamonds <- col_types(diamonds, 
                           cols=c(cut="factor", color="factor", clarity="factor"))
    

  Thanks in advance!

    Recommended answer

    read_excel uses Excel cell types to guess column types for use in R. I also agree with the opinion of read_excel that one should read the data and allow a limited set of column types. Then if the user wishes, type conversion can take place later.
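To make the point about guessed and permitted column types concrete, here is a minimal sketch. It does not use the asker's myfile.xlsx; it assumes the sample workbook datasets.xlsx that ships with readxl (located with readxl_example()) and its "iris" sheet. The object names iris_xl and iris_text are made up for illustration.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl
# (an assumption for this sketch; substitute your own file path)
path <- readxl_example("datasets.xlsx")

# Default behaviour: column types are guessed from the Excel cell types
iris_xl <- read_excel(path, sheet = "iris")
sapply(iris_xl, class)

# col_types takes a character vector whose values must be one of
# "skip", "guess", "logical", "numeric", "date", "text" or "list";
# "factor" is not among them. A single value is recycled to all columns.
iris_text <- read_excel(path, sheet = "iris", col_types = "text")
sapply(iris_text, class)
```

Text columns therefore arrive as character vectors, and any conversion to factor has to be a separate step after reading.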

    There is no function called col_types. That is a parameter name for read_excel. The tidyverse way would be:

    library(tidyverse)
    
    (foo <- tibble(x = letters[1:3], y = LETTERS[4:6], z = 1:3))
    #> # A tibble: 3 x 3
    #>   x     y         z
    #>   <chr> <chr> <int>
    #> 1 a     D         1
    #> 2 b     E         2
    #> 3 c     F         3
    
    foo %>% 
      mutate_at(vars(x, y), factor)
    #> # A tibble: 3 x 3
    #>   x     y         z
    #>   <fct> <fct> <int>
    #> 1 a     D         1
    #> 2 b     E         2
    #> 3 c     F         3
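In more recent dplyr releases, the scoped verbs such as mutate_at() are superseded by across(). The following is a sketch of the same conversion in that style, assuming dplyr 1.0.0 or later; foo_named and foo_all_chr are illustrative names, not part of the original answer.

```r
library(dplyr)
library(tibble)

foo <- tibble(x = letters[1:3], y = LETTERS[4:6], z = 1:3)

# Convert specific columns, selected by name, to factors
foo_named <- foo %>%
  mutate(across(c(x, y), factor))

# Convert every character column to a factor, whatever it is called
foo_all_chr <- foo %>%
  mutate(across(where(is.character), factor))

sapply(foo_named, class)
sapply(foo_all_chr, class)
```

The where(is.character) form may be the closer fit for the question, since it avoids listing columns one by one after reading a file.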
    
