as.Date(as.POSIXct()) gives the wrong date?
Problem description
I'd been trying to look through a data frame, extracting all rows where the date component of a POSIXct column matched a certain value. I came across the following, which is confusing me mightily: as.Date(as.POSIXct(...)) doesn't always return the correct date.
> dt <- as.POSIXct('2012-08-06 09:35:23')
> dt
[1] "2012-08-06 09:35:23 EST"
> as.Date(dt)
[1] "2012-08-05"
Why is the date of '2012-08-06 09:35:23' equal to '2012-08-05'?
I suspect it has something to do with different timezones being used, so, noting that the timezone of dt was 'EST', I gave this to as.Date:
> as.Date(as.POSIXct('2012-08-06 09:35:23'), tz='EST')
[1] "2012-08-05"
But it still returns 2012-08-05.
Why is this? How can I find all datetimes in my data frame that fall on the date 2012-08-06? (subset(my.df, as.character(as.Date(datetime), tz='EST') == '2012-08-06') does not return the row with datetime dt, even though it did occur on 2012-08-06.)
Added details: Linux 64-bit (though I can reproduce it on 32-bit); I get this on both R 3.0.1 and 3.0.0; and I am currently in AEST (Australian Eastern Standard Time).
Recommended answer
The safe way to do this is to pass the date value through format. This does create an additional step, but as.Date will accept the character result if it is formatted with "-" or "/" as the separator:
as.Date( format( as.POSIXct('2019-03-11 23:59:59'), "%Y-%m-%d") )
[1] "2019-03-11"
as.Date( as.POSIXct('2019-03-11 23:59:59') ) # I'm in a locale where the problem might exist
[1] "2019-03-12"
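The same format-based comparison can be applied to the data frame subsetting asked about in the question. The following is a minimal sketch, not the asker's actual data: the contents of my.df are made up for illustration, and the time zone is set explicitly to "Australia/Sydney" (an assumption) so the result does not depend on the session's local zone.

```r
# Hypothetical data frame standing in for the asker's my.df.
my.df <- data.frame(id = 1:3)
my.df$datetime <- as.POSIXct(
  c("2012-08-05 23:50:00", "2012-08-06 09:35:23", "2012-08-07 00:10:00"),
  tz = "Australia/Sydney"
)

# format() renders each timestamp in the time zone stored on the column,
# so the comparison is made against the local calendar date.
on_day <- format(my.df$datetime, "%Y-%m-%d") == "2012-08-06"
my.df[on_day, ]

# For comparison: the calendar dates of the same instants as seen in UTC.
utc_dates <- as.Date(my.df$datetime, tz = "UTC")
utc_dates
```

Only the second row is selected by on_day, whereas a comparison against utc_dates would pick out a different row.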
The documentation for timezones is confusing to me too. In some cases (including this one, as it turned out) "EST" is ambiguous and may actually refer to a timezone in Australia. Try "EST5EDT" or "America/New_York" if you happen to be in North America.
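As a small illustration of why a region-style name is less error-prone than an abbreviation, the sketch below converts one instant to a calendar date in two explicitly named zones. The choice of "Australia/Sydney" for the asker's location is an assumption based on the AEST detail in the question.

```r
# One instant, created with an explicit, unambiguous zone name.
dt <- as.POSIXct("2012-08-06 09:35:23", tz = "Australia/Sydney")

# Calendar date of that instant in Sydney.
syd <- as.Date(dt, tz = "Australia/Sydney")
syd   # "2012-08-06"

# Calendar date of the same instant in New York (still the evening before).
nyc <- as.Date(dt, tz = "America/New_York")
nyc   # "2012-08-05"
```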
In this case it could also relate to differences in how your OS (which you did not state) handles the 'tz' argument, since I get "2012-08-06". (I'm in the US PDT timezone at the moment, although I'm not sure that should matter.) Changing which function gets the tz argument may clarify things (or not):
> as.Date(as.POSIXct('2012-08-06 19:35:23', tz='EST'))
[1] "2012-08-07"
> as.Date(as.POSIXct('2012-08-06 17:35:23', tz='EST'))
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 21:35:23'), tz='EST')
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 22:35:23'), tz='EST')
[1] "2012-08-07"
If you omit tz from as.Date, the conversion to a date is done in UTC. If you omit tz from as.POSIXct, the string is interpreted in your session's local timezone.
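To make the role of the conversion zone concrete, here is a hedged sketch. local_date() is a hypothetical helper (not part of base R) that converts a POSIXct value to a Date using the zone stored on the object itself, falling back to the session's zone when none is stored. The "Australia/Sydney" zone is again an assumption made so the example is reproducible.

```r
dt <- as.POSIXct("2012-08-06 09:35:23", tz = "Australia/Sydney")

# Converting in UTC: 09:35 in Sydney is still 23:35 of the previous day in UTC.
utc_date <- as.Date(dt, tz = "UTC")
utc_date   # "2012-08-05"

# Hypothetical helper: use the object's own "tzone" attribute for the conversion.
local_date <- function(x) {
  tz <- attr(x, "tzone")
  if (is.null(tz)) tz <- ""   # "" means the session's current time zone
  as.Date(x, tz = tz[1])
}

own_date <- local_date(dt)
own_date   # "2012-08-06"
```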
These are the unambiguous names of the Aussie timezones (at least on my Mac):
tzfile <- "/usr/share/zoneinfo/zone.tab"
tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
                     col.names = c("country", "coords", "name", "comments"),
                     as.is = TRUE, fill = TRUE, comment.char = "#")
grep("^Aus", tzones$name, value=TRUE)
[1] "Australia/Lord_Howe" "Australia/Hobart"
[3] "Australia/Currie" "Australia/Melbourne"
[5] "Australia/Sydney" "Australia/Broken_Hill"
[7] "Australia/Brisbane" "Australia/Lindeman"
[9] "Australia/Adelaide" "Australia/Darwin"
[11] "Australia/Perth" "Australia/Eucla"
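On systems where the zone.tab path differs, a possibly more portable route is OlsonNames(), which is available in base R from version 3.1.0 onward (later than the versions mentioned in the question). This is a sketch; the exact list returned depends on the time zone database installed on the machine.

```r
# List the zone names known to this R installation and keep the Australian ones.
aus <- grep("^Australia/", OlsonNames(), value = TRUE)
aus

# Check that a particular name is available before using it as a tz argument.
"Australia/Sydney" %in% aus
```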