Exclude specific time periods in R
Question
I'm a beginner with R. I have searched for ways to extract data for certain time periods but can't seem to find anything.
I have a time series of continuous data measured at 10-minute intervals over a period of five months. For simplicity's sake, the data is available in two columns as follows:
Timestamp Temp.Diff
2/14/2011 19:00 -0.385
2/14/2011 19:10 -0.535
2/14/2011 19:20 -0.484
2/14/2011 19:30 -0.409
2/14/2011 19:40 -0.385
2/14/2011 19:50 -0.215
... and so on for the next five months. I have read the Timestamp column into R using as.POSIXct().
Assuming that only certain times of the day are of interest to me (e.g. from 12 noon to 3 PM), I would like either to exclude the other hours of the day, or to extract just those 3 hours while keeping the data in sequential order (i.e. as a time series). I understand that you can easily subset data if you know the row numbers, but as this is a much larger dataset, is there a way to code R so that it automatically recognises the time period I'm looking at?
Recommended answer
You seem to know the basic idea, but are just missing the details. As you mentioned, we just transform the timestamps into POSIX objects and then subset.
lubridate solution
The easiest way is probably with lubridate. First load the package:
library(lubridate)
Next, convert the timestamps:
## mdy_hm parses month/day/year followed by hour:minute
d = mdy_hm(dd$Timestamp)
Then we select what we want. In this case, I want any times after 7:30 pm (regardless of the day):
dd[hour(d) > 19 | (hour(d) == 19 & minute(d) > 30),]
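The same approach can be adapted to the window mentioned in the question (12 noon to 3 PM). The following is a minimal sketch, not part of the original answer: the sample rows are made up purely for illustration, and it assumes the window should include 12:00 and stop just before 15:00.

```r
library(lubridate)

# Hypothetical sample rows (illustrative values only), spanning two days
dd <- data.frame(
  Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 14:50",
                "2/14/2011 15:00", "2/15/2011 12:10"),
  Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385),
  stringsAsFactors = FALSE
)

# Parse month/day/year hour:minute strings into date-times
d <- mdy_hm(dd$Timestamp)

# TRUE for rows from 12:00 up to, but not including, 15:00, on any day
in_window <- hour(d) >= 12 & hour(d) < 15

# Extract only the window; row order (and so the time ordering) is preserved
noon_to_three <- dd[in_window, ]

# Or exclude the window instead
outside <- dd[!in_window, ]
```

Because the subsetting uses a logical vector rather than row numbers, it applies unchanged to the full five-month data set.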
Base R solution
First create a lower bound:
lower = strptime("2/14/2011 19:30","%m/%d/%Y %H:%M")
Next, convert the timestamps to POSIX objects:
d = strptime(dd$Timestamp, "%m/%d/%Y %H:%M")
Finally, a bit of data frame subsetting:
dd[format(d,"%H:%M") > format(lower,"%H:%M"),]
Thanks to plannapus for this last part.
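The comparison works because zero-padded "%H:%M" strings sort in the same order as the times they represent. As a hedged sketch of how this could cover the 12 noon to 3 PM window from the question, using base R only: the sample rows are invented for illustration, both endpoints are assumed to be inclusive, and tz = "UTC" is an assumption made only to keep daylight saving time out of the parsing.

```r
# Hypothetical sample rows (illustrative values only), spanning two days
dd <- data.frame(
  Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 14:50",
                "2/14/2011 15:00", "2/15/2011 12:10"),
  Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385),
  stringsAsFactors = FALSE
)

# Parse the timestamps; the time zone is fixed so no clock changes interfere
d <- strptime(dd$Timestamp, "%m/%d/%Y %H:%M", tz = "UTC")

# Time of day as a zero-padded string, e.g. "12:00"
hm <- format(d, "%H:%M")

# TRUE for rows between 12:00 and 15:00 inclusive, on any day
in_window <- hm >= "12:00" & hm <= "15:00"

# Extract only the window, or exclude it with the negated condition
window_rows <- dd[in_window, ]
other_rows  <- dd[!in_window, ]
```

To make the upper end exclusive, replace `<=` with `<`.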
Data for the examples above:
dd = read.table(textConnection('Timestamp Temp.Diff
"2/14/2011 19:00" -0.385
"2/14/2011 19:10" -0.535
"2/14/2011 19:20" -0.484
"2/14/2011 19:30" -0.409
"2/14/2011 19:40" -0.385
"2/14/2011 19:50" -0.215'), header=TRUE)