Fill in missing year in ordered list of dates

Question

I have collected some time series data from the web, and the timestamps I got look like this:

24 Jun 
21 Mar
20 Jan 
10 Dec
20 Jun 
20 Jan
10 Dec 
...

The interesting part is that the year is missing from the data. However, all the records are ordered, so the year can be inferred from the sequence and the missing values filled in. After imputation, the data should look like this:

24 Jun 2014
21 Mar 2014
20 Jan 2014
10 Dec 2013 
20 Jun 2013
20 Jan 2013
10 Dec 2012
...

Before I roll up my sleeves and start writing a for loop with nested logic: is there an easy way, ideally one that works out of the box in R, to impute the missing year?

Any suggestions are much appreciated!

Recommended answer

Here's one idea:

## Make data easily reproducible
df <- data.frame(day = c(24, 21, 20, 10, 20, 20, 10),
                 month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Dec"))

## Convert each month-day combo to its day of the year (1-366), which
## strftime() calls "%j". The year 2012 is only a placeholder; it is a
## leap year, so a "29 Feb" record would parse too.
datestring <- paste("2012", match(df[[2]], month.abb), df[[1]], sep = "-")
date <- strptime(datestring, format = "%Y-%m-%d")
julian <- as.integer(strftime(date, format = "%j"))

## The records run from newest to oldest, so a transition between years
## occurs wherever the day of the year increases between two observations.
## 2014 is the known year of the first (most recent) record.
df$year <- 2014 - cumsum(diff(c(julian[1], julian)) > 0)

## Check that it worked
df
#   day month year
# 1  24   Jun 2014
# 2  21   Mar 2014
# 3  20   Jan 2014
# 4  10   Dec 2013
# 5  20   Jun 2013
# 6  20   Jan 2013
# 7  10   Dec 2012
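
The same idea can be wrapped in a small function so that the year of the first record is a parameter rather than a hard-coded 2014. This is only a sketch: the name `fill_years` and its arguments are hypothetical and not part of the original answer, and it assumes the records are ordered newest first with three-letter English month abbreviations.

```r
## Hypothetical helper (not from the original answer): infer the year of
## each day/month record, given records ordered newest first and the
## known year of the first record.
fill_years <- function(day, month, first_year) {
  ## Placeholder leap year 2012 so that "29 Feb" also parses
  dummy <- as.Date(paste(2012, match(month, month.abb), day, sep = "-"),
                   format = "%Y-%m-%d")
  doy <- as.integer(format(dummy, "%j"))
  ## Going back in time the day of the year normally falls; a rise means
  ## the sequence has crossed into the previous year
  first_year - cumsum(c(FALSE, diff(doy) > 0))
}

fill_years(day = c(24, 21, 20, 10, 20, 20, 10),
           month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Dec"),
           first_year = 2014)
# [1] 2014 2014 2014 2013 2013 2013 2012
```

Note that this rule can only detect a year change when the day of the year goes up from one record to the next. If two consecutive records are a year or more apart (for example, 20 Jan followed by 20 Jan of the previous year), the change goes unnoticed and the later years are off.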
