Convert type of multiple columns of a dataframe at once

Question

I seem to spend a lot of time creating a data frame from a file, a database or some other source, and then converting each column to the type I want (numeric, factor, character, etc.). Is there a way to do this in one step, possibly by supplying a vector of types?

foo<-data.frame(x=c(1:10), 
                y=c("red", "red", "red", "blue", "blue", 
                    "blue", "yellow", "yellow", "yellow", 
                    "green"),
                z=Sys.Date()+c(1:10))

foo$x<-as.character(foo$x)
foo$y<-as.character(foo$y)
foo$z<-as.numeric(foo$z)

Instead of the last three commands, I'd like to do something like:

foo<-convert.magic(foo, c(character, character, numeric))

Recommended answer

Edit: See this related question for some simplifications and extensions of this basic idea.

My comment on Brandon's answer, using switch:

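convert.magic <- function(obj, types){
    # types: character vector with one type name per column of obj
    for (i in seq_along(obj)){
        FUN <- switch(types[i], character = as.character,
                                numeric = as.numeric,
                                factor = as.factor,
                                stop("unsupported type: ", types[i]))
        obj[[i]] <- FUN(obj[[i]])
    }
    obj
}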

> out <- convert.magic(foo,c('character','character','numeric'))
> str(out)
'data.frame':   10 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "4" ...
 $ y: chr  "red" "red" "red" "blue" ...
 $ z: num  15254 15255 15256 15257 15258 ...

For truly large data frames you may want to use lapply instead of the for loop:

convert.magic1 <- function(obj, types){
    # Convert each column with the function selected by its entry in types
    out <- lapply(seq_along(obj), function(i){
        FUN1 <- switch(types[i], character = as.character,
                                 numeric = as.numeric,
                                 factor = as.factor)
        FUN1(obj[[i]])
    })
    names(out) <- colnames(obj)
    as.data.frame(out, stringsAsFactors = FALSE)
}
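
As a variation on the same idea, the types could be given as a named vector so that only the listed columns are converted. This is only a sketch: the function name convert_cols and the converters lookup list are hypothetical and not part of the original answer.

```r
# Hypothetical helper: convert only the columns named in `types`.
# `types` is a named character vector, e.g. c(x = "character", y = "factor").
convert_cols <- function(df, types) {
    converters <- list(character = as.character,
                       numeric   = as.numeric,
                       factor    = as.factor)
    unknown <- setdiff(types, names(converters))
    if (length(unknown) > 0) {
        stop("unsupported type(s): ", paste(unknown, collapse = ", "))
    }
    for (col in names(types)) {
        # Look up the converter by type name and apply it to the column
        df[[col]] <- converters[[types[[col]]]](df[[col]])
    }
    df
}

foo <- data.frame(x = 1:3,
                  y = c("red", "blue", "green"),
                  stringsAsFactors = FALSE)
out <- convert_cols(foo, c(x = "character", y = "factor"))
str(out)
```

Columns that are not named in types are left untouched, which may be convenient when only a few columns need converting.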

When doing this, be aware of some of the intricacies of coercing data in R. For example, converting a factor to numeric usually requires as.numeric(as.character(...)), because as.numeric() applied directly to a factor returns its internal integer codes. Also be aware that data.frame() and as.data.frame() convert character columns to factors by default in R versions before 4.0.0 (stringsAsFactors = TRUE).
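
The factor-to-numeric pitfall can be illustrated with a small made-up vector (the variable names f, codes and values are chosen here for illustration only):

```r
# A factor whose labels happen to look like numbers
f <- factor(c("10", "20", "10", "30"))

# as.numeric() on a factor returns the internal level codes
codes <- as.numeric(f)

# Going through as.character() first recovers the numbers the labels spell out
values <- as.numeric(as.character(f))

print(codes)   # 1 2 1 3
print(values)  # 10 20 10 30
```

This is why a function like convert.magic, called with 'numeric' on a factor column, may silently return level codes instead of the values you expect.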
