Reading timestamp data in R from multiple time zones
I have a column of time stamps in character format that looks like this:
2015-09-24 06:00:00 UTC
2015-09-24 05:00:00 UTC
dateTimeZone <- c("2015-09-24 06:00:00 UTC","2015-09-24 05:00:00 UTC")
I'd like to convert this character data into time data using POSIXct, and if I knew that all the time stamps were in UTC, I would do it like this:
dateTimeZone <- as.POSIXct(dateTimeZone, tz="UTC")
However, I don't necessarily know that all the time stamps are in UTC, so I tried
dateTimeZone <- as.POSIXct(dateTimeZone, format = "%Y-%m-%d %H:%M:%S %Z")
However, because strptime supports %Z only for output, this returns the following error:
Error in strptime(x, format, tz = tz) : use of %Z for input is not supported
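For contrast, the numeric-offset specifier `%z` (for example `-0700`) is accepted on input, so strings that carry an offset rather than an abbreviation can be parsed directly. The sketch below uses hypothetical sample strings, rewritten from the ones above with offsets in place of the abbreviations; it does not help when the data only contains abbreviations such as `PDT`.

```r
# Hypothetical input: the same instants, labelled with numeric UTC offsets
# instead of time zone abbreviations.
x <- c("2015-09-24 06:00:00 +0000", "2015-09-24 05:00:00 -0700")

# %z is read on input; the result is stored as a single POSIXct vector in UTC.
parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# "2015-09-24 06:00:00" "2015-09-24 12:00:00"
```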
I checked the documentation for the lubridate package, and I couldn't see that it handled this issue any differently than POSIXct.
Is my only option to check the time zone of each row and then use the appropriate time zone with something like the following?
temp[grepl("UTC", dateTimeZone)] <- as.POSIXct(grep("UTC", dateTimeZone, value=TRUE), tz="UTC")
temp[grepl("PDT", dateTimeZone)] <- as.POSIXct(grep("PDT", dateTimeZone, value=TRUE), tz="America/Los_Angeles")
You can get there by checking each row, parsing it in its own time zone, and then converting everything to a consistent UTC time. (Edited to include matching the time zone abbreviations to full time zone names.)
dates <- c(
"2015-09-24 06:00:00 UTC",
"2015-09-24 05:00:00 PDT"
)
## Extract the time zone abbreviation (the last space-separated token) from each date
datestz <- vapply(strsplit(dates," "), tail, 1, FUN.VALUE="")
## Build a master list mapping abbreviations to
## full (Olson) time zone names. An arbitrary winter
## and summer date are used to try to catch daylight saving abbreviations.
tzabbrev <- vapply(
OlsonNames(),
function(x) c(
format(as.POSIXct("2000-01-01",tz=x),"%Z"),
format(as.POSIXct("2000-07-01",tz=x),"%Z")
),
FUN.VALUE=character(2)
)
tmp <- data.frame(Olson=OlsonNames(), t(tzabbrev), stringsAsFactors=FALSE)
final <- unique(data.frame(tmp[1], abbrev=unlist(tmp[-1])))
## Do the matching:
out <- Map(as.POSIXct, dates, tz=final$Olson[match(datestz,final$abbrev)])
as.POSIXct(unlist(out), origin="1970-01-01", tz="UTC")
# 2015-09-24 06:00:00 UTC 2015-09-24 05:00:00 PDT
#"2015-09-24 06:00:00 GMT" "2015-09-24 12:00:00 GMT"
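Because several Olson zones can share one abbreviation, `match()` above simply takes the first zone that reports it. If the set of abbreviations in the data is small and known, a hand-written lookup may be easier to audit. The sketch below is one way to do that; the helper name `parse_with_abbrev` and the contents of `tz_lookup` are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: parse "YYYY-mm-dd HH:MM:SS ABBR" strings into UTC.
parse_with_abbrev <- function(x, tz_lookup) {
  abbrev <- sub("^.* ", "", x)      # text after the last space
  stamp  <- sub(" [^ ]+$", "", x)   # everything before the last space

  # Fail early on abbreviations that have no mapping.
  unknown <- setdiff(abbrev, names(tz_lookup))
  if (length(unknown) > 0) {
    stop("No time zone mapping for: ", paste(unknown, collapse = ", "))
  }

  # Parse each stamp in its own zone and keep seconds since the epoch.
  secs <- mapply(
    function(s, z) as.numeric(as.POSIXct(s, tz = z)),
    stamp, tz_lookup[abbrev],
    USE.NAMES = FALSE
  )
  as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}

# Assumed mapping; extend it to cover the abbreviations in your own data.
tz_lookup <- c(UTC = "UTC",
               PDT = "America/Los_Angeles",
               PST = "America/Los_Angeles")

dates <- c("2015-09-24 06:00:00 UTC", "2015-09-24 05:00:00 PDT")
parsed <- parse_with_abbrev(dates, tz_lookup)
format(parsed, "%Y-%m-%d %H:%M:%S")
# "2015-09-24 06:00:00" "2015-09-24 12:00:00"
```

Note that the offset comes from the mapped zone and the date, not from the abbreviation itself, so a row labelled `PST` on a summer date would still be read with the daylight saving offset.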