Adjusting UTC date/time to different time zones by reference in `lubridate`
Problem description
I have a data.table with UTC date-time stamps for records spanning multiple time zones. I want to create a new column that shows each date-time stamp in the time zone of its own observation, which is specified by a variable in the same table:
require("lubridate")
require("data.table")
dt <- data.table(A = 1:5, B = rep(ymd_hms("2016-03-24 17:15:12", tz = "UTC"), 5), timezone = c("America/Indiana/Vincennes", "Australia/North", "Pacific/Palau", "Antarctica/Macquarie", "Asia/Nicosia"))
I tried to accomplish this with the following, but it does not work:
dt[, B_local := with_tz(B, tz = timezone)]
dt
Error in as.POSIXlt.POSIXct(x, tz) : invalid 'tz' value
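The error above is consistent with the underlying base R conversion accepting only a single time zone string per call, not a vector of them. A minimal base R sketch of that behaviour (not part of the original post; the variable names are illustrative):

```r
# Base R only; no packages are assumed here.
x <- as.POSIXct("2016-03-24 17:15:12", tz = "UTC")
zones <- c("Pacific/Palau", "Asia/Nicosia")

# Passing a vector of zones is rejected; capture the message instead of stopping.
res <- tryCatch(
  as.POSIXlt(x, tz = zones),
  error = function(e) conditionMessage(e)
)
print(res)

# A single zone is accepted; format() renders the same instant in that zone.
one_zone <- format(x, tz = zones[2])
print(one_zone)
```

So any per-row conversion has to hand the conversion function one zone at a time, for example per group or per row.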
When I add a `by` specification to the command, it gets closer to the desired output but is still incorrect. I think this is somehow due to non-unique pairs of date-time and time zone, as in this sample table, i.e.:
dt[, B_local := with_tz(B, tz = timezone), by = .(B, timezone)]
dt
   A                   B                  timezone             B_local
1: 1 2016-03-24 17:15:12 America/Indiana/Vincennes 2016-03-24 19:15:12
2: 2 2016-03-24 17:15:12           Australia/North 2016-03-24 19:15:12
3: 3 2016-03-24 17:15:12             Pacific/Palau 2016-03-24 19:15:12
4: 4 2016-03-24 17:15:12      Antarctica/Macquarie 2016-03-24 19:15:12
5: 5 2016-03-24 17:15:12              Asia/Nicosia 2016-03-24 19:15:12
Even if I change the grouping to `by = .(A)`, as in `dt[, B_local := with_tz(B, tz = timezone), by = .(A)]`, which splits the table into one group per row, the output is identical to the above.
NB: I'm more than happy to use something other than lubridate, but I'd prefer to work within data.table for efficiency, as I have a large dataset.
Recommended answer
This stuff is super messy and finicky. I wrote a time zone 'shifter' in the RcppCCTZ package, as the underlying CCTZ library made that feasible.
One huge caveat: time zones appear only in the formatted output, so I have a solution for you here, but the target output is now text. Edit: with one more parse via anytime(), it can of course be POSIXct again (in your local TZ).
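To illustrate why the zone only shows up in formatted output, here is a small base R sketch (an illustration added here, not code from the answer; the names `x`, `y`, `same_instant` and `shown` are made up). A POSIXct vector stores seconds since the epoch plus one `tzone` attribute for the whole vector, so changing that attribute changes the display but not the instant:

```r
# Base R only; no packages are assumed here.
x <- as.POSIXct("2016-03-24 17:15:12", tz = "UTC")

# Copy the value and change only the display zone.
y <- x
attr(y, "tzone") <- "Pacific/Palau"

# The stored instant is unchanged ...
same_instant <- as.numeric(x) == as.numeric(y)

# ... but the formatted text differs, because format() uses the tzone attribute.
shown <- c(format(x), format(y))
print(same_instant)
print(shown)
```

Because the attribute applies to the whole vector, a single POSIXct column cannot display a different zone per row, which is why the per-row result ends up as text.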
Also note that I used a helper function from anytime to set the time.
suppressMessages({
    library("data.table")
    library("RcppCCTZ")
    library("anytime")
})
# utctime() is the anytime helper used to set the time
dt <- data.table(A = 1:5,
                 B = rep(utctime("2016-03-24 17:15:12", tz="UTC"), 5),
                 timezone = c("America/Indiana/Vincennes", "Australia/North",
                              "Pacific/Palau", "Antarctica/Macquarie",
                              "Asia/Nicosia"))
# shift each row from UTC to its own zone and format the result as text
dt[ , newTime := format(toTz(B, "UTC", timezone), tz=timezone), by=A ]
# parse the text back into POSIXct (in the local TZ)
dt[ , pt := anytime(newTime), by=A ]
Output
R> dt <- data.table(A = 1:5,
+ B = rep(utctime("2016-03-24 17:15:12", tz="UTC"), 5),
+ timezone = c("America/Indiana/Vincennes", "Australia/North",
+ "Pacific/Palau", "Antarctica/Macquarie",
+ "Asia/Nicosia"))
R> dt[ , newTime := format(toTz(B, "UTC", timezone), tz=timezone), by=A ]
R> dt[ , pt := anytime(newTime), by=A ]
R> dt
   A                   B                  timezone             newTime                  pt
1: 1 2016-03-24 22:15:12 America/Indiana/Vincennes 2016-03-24 18:15:12 2016-03-24 18:15:12
2: 2 2016-03-24 22:15:12           Australia/North 2016-03-25 07:45:12 2016-03-25 07:45:12
3: 3 2016-03-24 22:15:12             Pacific/Palau 2016-03-25 07:15:12 2016-03-25 07:15:12
4: 4 2016-03-24 22:15:12      Antarctica/Macquarie 2016-03-25 09:15:12 2016-03-25 09:15:12
5: 5 2016-03-24 22:15:12              Asia/Nicosia 2016-03-25 00:15:12 2016-03-25 00:15:12
R>
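As a possible alternative for large tables, and not something the answer itself proposes, the sketch below groups by time zone rather than by row, so base R's format() is called once per distinct zone. It assumes only data.table is installed; the column name `B_local` is borrowed from the question, and the result is again text rather than POSIXct:

```r
library(data.table)

dt <- data.table(
  A = 1:5,
  B = rep(as.POSIXct("2016-03-24 17:15:12", tz = "UTC"), 5),
  timezone = c("America/Indiana/Vincennes", "Australia/North",
               "Pacific/Palau", "Antarctica/Macquarie", "Asia/Nicosia")
)

# One format() call per distinct zone; .BY$timezone is the single zone
# of the current group, so format() receives a length-one tz.
dt[, B_local := format(B, tz = .BY$timezone, usetz = TRUE), by = timezone]
print(dt)
```

With many rows and few distinct zones, this should need far fewer conversion calls than grouping by a row identifier, though that has not been benchmarked here.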