Specify custom Date format for colClasses argument in read.table/read.csv


Question

Is there a way to specify the Date format when using the colClasses argument in read.table/read.csv?

(I realise I can convert after importing, but with many date columns like this, it would be easier to do it in the import step.)

I have a .csv with date columns in the format %d/%m/%Y.

dataImport <- read.csv("data.csv", colClasses = c("factor","factor","Date"))

This gets the conversion wrong. For example, 15/07/2008 becomes 0015-07-20.
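To see why this happens, here is a minimal sketch, assuming that colClasses = "Date" ends up calling as.Date() on the raw strings with no format. Without a format, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", so the leading 15 is read as the year. The variable names below are illustrative only.

```r
x <- "15/07/2008"

# No format given: the string is matched against "%Y/%m/%d",
# so 15 is taken as the year, 07 as the month and 20 as the day.
parsed_default <- as.Date(x)

# Explicit day/month/year format: parsed as 15 July 2008.
parsed_explicit <- as.Date(x, format = "%d/%m/%Y")

print(parsed_default)
print(parsed_explicit)
```

The second call is the conversion the import step needs to perform.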

data <- 
structure(list(func_loc = structure(c(1L, 2L, 3L, 3L, 3L, 3L, 
3L, 4L, 4L, 5L), .Label = c("3076WAG0003", "3076WAG0004", "3076WAG0007", 
"3076WAG0009", "3076WAG0010"), class = "factor"), order_type = structure(c(3L, 
3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 1L), .Label = c("PM01", "PM02", 
"PM03"), class = "factor"), actual_finish = structure(c(4L, 6L, 
1L, 2L, 3L, 7L, 1L, 8L, 1L, 5L), .Label = c("", "11/03/2008", 
"14/08/2008", "15/07/2008", "17/03/2008", "19/01/2009", "22/09/2008", 
"6/09/2007"), class = "factor")), .Names = c("func_loc", "order_type", 
"actual_finish"), row.names = c(NA, 10L), class = "data.frame")


write.csv(data, "data.csv", row.names = FALSE)

dataImport <- read.csv("data.csv")
str(dataImport)
dataImport

dataImport <- read.csv("data.csv", colClasses = c("factor","factor","Date"))
str(dataImport)
dataImport

This is what the output looks like:

Recommended answer

You can write your own function that accepts a string and converts it to a Date using the format you want, then use setAs to register it as an as method for a new class name. You can then use that class name as part of colClasses.

Try:

setAs("character","myDate", function(from) as.Date(from, format="%d/%m/%Y") )

tmp <- c("1, 15/08/2008", "2, 23/05/2010")
con <- textConnection(tmp)

tmp2 <- read.csv(con, colClasses=c('numeric','myDate'), header=FALSE)
str(tmp2)

Then modify as needed to work for your data.

Edit ---

You might want to run setClass('myDate') first to avoid the warning. You can ignore the warning, but it gets annoying if you do this a lot, and this one simple call gets rid of it.
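Putting the pieces together for data shaped like the question's, a rough sketch might look like the following. The inline CSV lines are a made-up subset of the question's columns, used here in place of data.csv so the example is self-contained. Empty date fields should come through as NA, because as.Date() returns NA for strings that do not match the format.

```r
library(methods)

# Declare the class first so that setAs() does not warn about an undefined class.
setClass("myDate")
setAs("character", "myDate",
      function(from) as.Date(from, format = "%d/%m/%Y"))

# Illustrative sample in the same layout as the question's data.csv.
csv_lines <- c(
  "func_loc,order_type,actual_finish",
  "3076WAG0003,PM03,15/07/2008",
  "3076WAG0004,PM03,19/01/2009",
  "3076WAG0007,PM01,"
)

dataImport <- read.csv(text = csv_lines,
                       colClasses = c("factor", "factor", "myDate"))
str(dataImport)
```

With several date columns, the same "myDate" name can be repeated in colClasses for each of them.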
