Reading timestamp data in R from multiple time zones


Problem description


I have a column of time stamps in character format that looks like this:

2015-09-24 06:00:00 UTC

2015-09-24 05:00:00 UTC

dateTimeZone <- c("2015-09-24 06:00:00 UTC","2015-09-24 05:00:00 UTC")

I'd like to convert this character data into time data using POSIXct, and if I knew that all the time stamps were in UTC, I would do it like this:

dateTimeZone <- as.POSIXct(dateTimeZone, tz="UTC")

However, I don't necessarily know that all the time stamps are in UTC, so I tried

dateTimeZone <- as.POSIXct(dateTimeZone, format = "%Y-%m-%d %H:%M:%S %Z")

However, because strptime supports %Z only for output, this returns the following error:

Error in strptime(x, format, tz = tz) : use of %Z for input is not supported
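
The asymmetry can be reproduced in a few lines. This is a minimal sketch, assuming a current R installation where `%Z` is still output-only; the variable names `stamp`, `formatted`, and `parsed` are illustrative and not part of the original question.

```r
# A time that is known to be in UTC.
stamp <- as.POSIXct("2015-09-24 06:00:00", tz = "UTC")

# Output: %Z is accepted by format() and prints the zone abbreviation.
formatted <- format(stamp, "%Y-%m-%d %H:%M:%S %Z")

# Input: the same format string is rejected when parsing.
# tryCatch() captures the error message instead of stopping the script.
parsed <- tryCatch(
  as.POSIXct("2015-09-24 06:00:00 UTC",
             format = "%Y-%m-%d %H:%M:%S %Z", tz = "UTC"),
  error = function(e) conditionMessage(e)
)

print(formatted)
print(parsed)
```

So the abbreviation has to be split off and translated into a `tz=` argument before parsing.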

I checked the documentation for the lubridate package, and I couldn't see that it handled this issue any differently than POSIXct.

Is my only option to check the time zone of each row and then use the appropriate time zone with something like the following?

utc <- grepl("UTC", dateTimeZone)
pdt <- grepl("PDT", dateTimeZone)
temp[utc] <- as.POSIXct(dateTimeZone[utc], tz="UTC")
temp[pdt] <- as.POSIXct(dateTimeZone[pdt], tz="America/Los_Angeles")

Solution

You can get there by checking each row, parsing it with the time zone it names, and then converting everything to a consistent UTC time. (Edited to include matching the time zone abbreviations to full time zone names.)

dates <- c(
  "2015-09-24 06:00:00 UTC",
  "2015-09-24 05:00:00 PDT"
)

#extract timezone from dates
datestz <- vapply(strsplit(dates," "), tail, 1, FUN.VALUE="")

## Make a master list of abbreviation to 
## full timezone names. Used an arbitrary summer
## and winter date to try to catch daylight savings timezones.

tzabbrev <- vapply(
  OlsonNames(),
  function(x) c(
    format(as.POSIXct("2000-01-01",tz=x),"%Z"),
    format(as.POSIXct("2000-07-01",tz=x),"%Z")
  ),
  FUN.VALUE=character(2)
)
tmp <- data.frame(Olson=OlsonNames(), t(tzabbrev), stringsAsFactors=FALSE)
final <- unique(data.frame(tmp[1], abbrev=unlist(tmp[-1])))

## Do the matching:
out <- Map(as.POSIXct, dates, tz=final$Olson[match(datestz,final$abbrev)])
as.POSIXct(unlist(out), origin="1970-01-01", tz="UTC")
#  2015-09-24 06:00:00 UTC   2015-09-24 05:00:00 PDT 
#"2015-09-24 06:00:00 GMT" "2015-09-24 12:00:00 GMT" 
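
One caveat with a table built from every Olson name is that abbreviations are not unique, and `match()` simply takes the first zone that happens to use a given abbreviation. If the set of abbreviations in the data is small and known, a hand-written lookup may be easier to audit. The following is a sketch under that assumption; `tz_lookup` and `parse_mixed_tz` are hypothetical names introduced here, and the mapping is an illustrative choice rather than a complete table.

```r
# Hypothetical lookup: abbreviation -> Olson name (an assumption for this sketch).
# PST and PDT share one zone, so R decides on daylight saving from the date,
# not from the abbreviation itself.
tz_lookup <- c(
  UTC = "UTC",
  PDT = "America/Los_Angeles",
  PST = "America/Los_Angeles"
)

parse_mixed_tz <- function(x, lookup = tz_lookup) {
  # Split "2015-09-24 05:00:00 PDT" into the abbreviation and the timestamp.
  abbrev <- sub("^.*[[:space:]]+([^[:space:]]+)$", "\\1", x)
  stamp  <- sub("[[:space:]]+[^[:space:]]+$", "", x)

  zone <- unname(lookup[abbrev])
  if (anyNA(zone)) {
    stop("unknown time zone abbreviation: ",
         paste(unique(abbrev[is.na(zone)]), collapse = ", "))
  }

  # Parse each timestamp in its own zone, keeping seconds since the epoch.
  secs <- mapply(
    function(s, z) as.numeric(as.POSIXct(s, tz = z, format = "%Y-%m-%d %H:%M:%S")),
    stamp, zone, USE.NAMES = FALSE
  )

  # Return one POSIXct vector expressed in UTC.
  as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}

dates <- c("2015-09-24 06:00:00 UTC", "2015-09-24 05:00:00 PDT")
parsed <- parse_mixed_tz(dates)
print(format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
```

Unknown abbreviations stop with an error here rather than silently becoming `NA`, which makes gaps in the lookup visible early.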
