Subset dataframe based on POSIXct date and time greater than datetime using dplyr
Question
I am not sure what is going wrong when I select date times stored in POSIXct format. I have read several comments on subsetting a dataframe based on as.Date, and I can get that to work without an issue. I have also read many posts suggesting that filtering on POSIXct values should work, but for some reason I cannot get it to work.
Example dataframe:
library(lubridate)
library(dplyr)
date_test <- seq(ymd_hms('2016-07-01 00:00:00'),ymd_hms('2016-08-01 00:00:00'), by = '15 min')
date_test <- data.frame(date_test)
date_test$datetime <- date_test$date_test
date_test <- select(date_test, -date_test)
I checked that the column is in POSIXct format and then tried several ways to subset the dataframe to rows later than 2016-07-01 01:15:00. However, the output never shows the date times earlier than 2016-07-01 01:15:00 being removed. I am sorry if this has been asked before; I have searched but could not find it, and I have tried to get this to work. I am using UTC as the timezone to avoid daylight saving time issues, so that should not be the issue here, unless the filter requires it.
class(date_test$datetime)
date_test <- date_test %>% filter(datetime > '2016-07-01 01:15:00')
date_test <- date_test %>%
filter(datetime > as.POSIXct("2016-07-01 00:15"))
date_test <- subset(date_test, datetime > as.POSIXct('2016-07-01 01:15:00'))
Now if I filter using:
date_test <- date_test %>%
filter(datetime > as.POSIXct("2016-07-10 01:15:00"))
the output is very strange: it starts a day earlier and at the wrong time?
2016-07-09 13:30:00
2016-07-09 13:45:00
2016-07-09 14:00:00
2016-07-09 14:15:00
2016-07-09 14:30:00
If it helps, I am using macOS Sierra with RStudio version 1.0.143, R "You Stupid Darkness", dplyr 0.5 and lubridate 1.6.
Recommended answer
ymd_hms uses POSIXct times in the "UTC" timezone by default, whereas as.POSIXct uses the system timezone (e.g. Australia for me). The same string therefore refers to two different instants depending on which function parses it, so the cutoff and the column no longer line up. You need to consistently use ymd_hms, or change to the "UTC" timezone as per Dave's suggestion in the comments.
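To make the mismatch concrete, here is a minimal sketch that parses the same string both ways. The timezone "Pacific/Auckland" is only an assumption standing in for a non-UTC system timezone; it is passed explicitly so the result does not depend on the machine running it.

```r
library(lubridate)

# The same clock time, parsed two ways
utc_time   <- ymd_hms("2016-07-10 01:15:00")   # lubridate: UTC by default
local_time <- as.POSIXct("2016-07-10 01:15:00",
                         tz = "Pacific/Auckland")  # assumed stand-in for the system timezone

attr(utc_time, "tzone")    # "UTC"
attr(local_time, "tzone")  # "Pacific/Auckland"

# The two values are different instants, 12 hours apart in July
offset_hours <- as.numeric(difftime(utc_time, local_time, units = "hours"))
offset_hours               # 12

# Shown in UTC, the "local" cutoff falls on the previous day
local_in_utc <- format(local_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
local_in_utc               # "2016-07-09 13:15:00"
```

A cutoff built with as.POSIXct and no tz argument behaves like local_time here, shifted by whatever offset the system timezone has.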
For example, these work:
date_test <- seq(ymd_hms('2016-07-01 00:30:00'),ymd_hms('2016-07-01 01:30:00'), by = '15 min')
date_test <- data.frame(datetime=date_test)
date_test
# datetime
#1 2016-07-01 00:30:00
#2 2016-07-01 00:45:00
#3 2016-07-01 01:00:00
#4 2016-07-01 01:15:00
#5 2016-07-01 01:30:00
date_test %>%
filter(datetime > as.POSIXct("2016-07-01 01:00:00", tz="UTC"))
date_test %>%
filter(datetime > ymd_hms("2016-07-01 01:00:00"))
# datetime
#1 2016-07-01 01:15:00
#2 2016-07-01 01:30:00
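The day-behind output in the question can be reproduced the same way. The sketch below rebuilds the question's series and filters it with two cutoffs. As an assumption for illustration, the non-UTC cutoff uses "Pacific/Auckland", which happens to give the same first row as the question; the asker's actual system timezone is not stated.

```r
library(lubridate)
library(dplyr)

# Full series from the question, stored in UTC
date_test <- data.frame(
  datetime = seq(ymd_hms("2016-07-01 00:00:00"),
                 ymd_hms("2016-08-01 00:00:00"),
                 by = "15 min")
)

# Cutoff parsed in a non-UTC zone (assumed stand-in for the system timezone)
cutoff_local <- as.POSIXct("2016-07-10 01:15:00", tz = "Pacific/Auckland")
# Cutoff parsed in UTC, matching the column
cutoff_utc <- ymd_hms("2016-07-10 01:15:00")

first_local <- date_test %>% filter(datetime > cutoff_local) %>% slice(1)
first_utc   <- date_test %>% filter(datetime > cutoff_utc) %>% slice(1)

first_local$datetime  # 2016-07-09 13:30:00 UTC
first_utc$datetime    # 2016-07-10 01:30:00 UTC
```

The filter itself is correct in both cases; only the instant that the cutoff string resolves to differs.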