Convert factor to date object in R without NA

Problem description

Question: how can I convert a factor to a date object without getting NA values?

Here is a similar post: Convert factor to date/time in R

In that post, the user converted the factor to a character object before converting it to a date. I am getting NA values when converting to a character object using as.character inside the as.Date function.

I have a column in the data frame that holds dates in factor format, with different numbers of occurrences per date. Here is the information contained in the data.frame.

> head(fraud, 5)
  TRANSACTION.DATE TRANSACTION.AMOUNT AIR.TRAVEL.DATE POSTING.DATE
1 2/27/14                  25.00                 <NA>          2/28/14
2 2/28/14                  25.00                 <NA>          2/28/14
3 2/27/14                  25.00                 <NA>          2/28/14
4 2/27/14                  20.00              2/27/14          2/28/14
5 2/27/14                  12.13                 <NA>          2/28/14

> str(fraud$TRANSACTION.DATE)
 Factor w/ 519 levels "1/1/14","1/1/15",..: 228 230 228 228 228 230 226 228 230 228 ...

> summary(fraud$TRANSACTION.DATE, 5)
9/30/14 9/17/14 11/4/14 9/23/14 (Other) 
    197     187     171     160   19221 

Converting the factor to a date object resulted in NA values.

> fraud$TRANSACTION.DATE <- as.Date(as.character(fraud$TRANSACTION.DATE), 
+                                       format = "%m/%d/%Y")
> head(fraud$TRANSACTION.DATE, 5)
[1] NA NA NA NA NA

Checking whether the as.character function works.

> fraud$TRANSACTION.DATE <- as.character(fraud$TRANSACTION.DATE)
> head(fraud$TRANSACTION.DATE)
[1] NA NA NA NA NA NA
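
A note on reading this check: the previous step had already overwritten the column with NA dates, so as.character could only return NA here, whether or not it works on factors. The following is a minimal sketch, assuming a small factor rebuilt from the sample rows shown earlier (the name txn is hypothetical, not from the original data), that keeps the original values and stores each conversion in a separate vector:

```r
# Hypothetical stand-in for fraud$TRANSACTION.DATE, built from the sample rows
txn <- factor(c("2/27/14", "2/28/14", "2/27/14", "2/27/14", "2/27/14"))

# as.character() on a factor returns the level labels, not NA
txn_chr <- as.character(txn)

# Store the conversion in a new vector so that a failed attempt
# does not destroy the original column
txn_date <- as.Date(txn_chr, format = "%m/%d/%y")

# Count the values that were present as text but failed to parse
n_failed <- sum(is.na(txn_date) & !is.na(txn_chr))

# Once a value is an NA date, as.character() can only return NA
overwritten <- as.character(as.Date(NA))
```

Assigning the result back to the data frame only after n_failed has been inspected avoids having to reload the data when a format string turns out to be wrong.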

Edit: I used the as.Date function, but the format is wrong.

> fraud$TRANSACTION.DATE <- as.Date(fraud$TRANSACTION.DATE, format = "%m/%d/%Y")
> str(fraud$TRANSACTION.DATE)
 Date[1:19936], format: "0014-02-27" "0014-02-28" "0014-02-27" "0014-02-27" "0014-02-27" ...
> head(fraud$TRANSACTION.DATE, 5)
[1] "0014-02-27" "0014-02-28" "0014-02-27" "0014-02-27" "0014-02-27"

Edit 2: here is the dput output.

> dput(droplevels(head(fraud$TRANSACTION.DATE)))
structure(c(1L, 2L, 1L, 1L, 1L, 2L), .Label = c("2/27/14", "2/28/14"
), class = "factor")

Solution: use %y instead of %Y.

> fraud$TRANSACTION.DATE <- as.Date(fraud$TRANSACTION.DATE, "%m/%d/%y")
> head(fraud$TRANSACTION.DATE, 5)
[1] "2014-02-27" "2014-02-28" "2014-02-27" "2014-02-27" "2014-02-27"
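
The same conversion can be applied to every date column in one pass. This is only a sketch: fraud_sample is a hypothetical data frame that mirrors the first rows printed above, and it assumes AIR.TRAVEL.DATE and POSTING.DATE use the same month/day/two-digit-year layout as TRANSACTION.DATE, which the head() output suggests but does not guarantee for all rows.

```r
# Hypothetical sample mirroring the first rows shown in the question
fraud_sample <- data.frame(
  TRANSACTION.DATE = factor(c("2/27/14", "2/28/14", "2/27/14", "2/27/14", "2/27/14")),
  AIR.TRAVEL.DATE  = factor(c(NA, NA, NA, "2/27/14", NA)),
  POSTING.DATE     = factor(rep("2/28/14", 5))
)

date_cols <- c("TRANSACTION.DATE", "AIR.TRAVEL.DATE", "POSTING.DATE")

# as.Date() accepts a factor directly; %y reads the two-digit year.
# Values that were already NA simply stay NA.
fraud_sample[date_cols] <- lapply(
  fraud_sample[date_cols],
  as.Date,
  format = "%m/%d/%y"
)

str(fraud_sample)
```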


Recommended answer

The problem now is that your format string states that the dates include the year with century, whereas your dates only contain the year without century. You need to use the %y placeholder, not the %Y one.

dates <- factor(c("2/27/14","2/28/14","2/27/14","2/27/14","2/27/14"))
as.Date(dates, format = "%m/%d/%y") # correct lowercase y
as.Date(dates, format = "%m/%d/%Y") # incorrect uppercase y

> as.Date(dates, format = "%m/%d/%y")
[1] "2014-02-27" "2014-02-28" "2014-02-27" "2014-02-27" "2014-02-27"
> as.Date(dates, format = "%m/%d/%Y")
[1] "14-02-27" "14-02-28" "14-02-27" "14-02-27" "14-02-27"

Notice that R gets it right when you use the correct placeholder: lowercase y.

What happens with %Y when you don't have a year with century seems to be OS dependent. As you can see, on Linux (Fedora 22) I get no padding of the year part, whereas you are seeing zero-padding.
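
Because the result of a mismatched %Y varies by platform, it may be safer to choose the placeholder from the text itself before parsing. The sketch below shows two things: how %y infers the century (R documents that 00 to 68 become 20xx and 69 to 99 become 19xx), and a hypothetical helper, pick_format, that inspects the width of the year field. The helper is an illustration only and assumes month/day/year strings separated by slashes.

```r
# %y infers the century: 00-68 map to 20xx, 69-99 map to 19xx
pivot_demo <- as.Date(c("2/27/14", "2/27/68", "2/27/69", "2/27/99"),
                      format = "%m/%d/%y")
pivot_years <- format(pivot_demo, "%Y")

# Hypothetical helper: choose %Y or %y from the width of the year field
pick_format <- function(x) {
  x <- as.character(x)   # works for factors and character vectors
  x <- x[!is.na(x)]      # ignore missing values
  if (all(grepl("/[0-9]{4}$", x))) {
    "%m/%d/%Y"           # four-digit year, e.g. 2/27/2014
  } else if (all(grepl("/[0-9]{2}$", x))) {
    "%m/%d/%y"           # two-digit year, e.g. 2/27/14
  } else {
    stop("mixed or unexpected year widths")
  }
}

dates <- factor(c("2/27/14", "2/28/14", "2/27/14"))
parsed <- as.Date(dates, format = pick_format(dates))
```

Note the century rule when the data could contain dates before 1969: a two-digit year alone cannot distinguish 1914 from 2014.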
