Partitioning data set by time intervals in R
Question
I have some data observed by hour. I am trying to subset this data by day or even by week. I am not sure how to proceed with this task in R.
A sample of the data is below.
date obs
2011-10-24 01:00:00 12
2011-10-24 02:00:00 4
2011-10-24 19:00:00 18
2011-10-24 20:00:00 7
2011-10-24 21:00:00 4
2011-10-24 22:00:00 2
2011-10-25 00:00:00 4
2011-10-25 01:00:00 2
2011-10-25 02:00:00 2
2011-10-25 15:00:00 12
2011-10-25 18:00:00 2
2011-10-25 19:00:00 3
2011-10-25 21:00:00 2
2011-10-25 23:00:00 9
2011-10-26 00:00:00 13
2011-10-26 01:00:00 11
Recommended answer
First I entered the data with the multiple spaces replaced with tabs.
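The answer does not show that step, so here is a minimal sketch of one way it might look, assuming the tab-separated sample is pasted in as text. The object name `dat` and the column names come from the post; the `txt` variable and the `stringsAsFactors = FALSE` argument (which keeps `date` as character on older R versions) are assumptions.

```r
# Sample rows from the question, with a tab between the timestamp and the value
txt <- "date\tobs
2011-10-24 01:00:00\t12
2011-10-24 02:00:00\t4
2011-10-24 19:00:00\t18
2011-10-24 20:00:00\t7
2011-10-24 21:00:00\t4
2011-10-24 22:00:00\t2
2011-10-25 00:00:00\t4
2011-10-25 01:00:00\t2
2011-10-25 02:00:00\t2
2011-10-25 15:00:00\t12
2011-10-25 18:00:00\t2
2011-10-25 19:00:00\t3
2011-10-25 21:00:00\t2
2011-10-25 23:00:00\t9
2011-10-26 00:00:00\t13
2011-10-26 01:00:00\t11"

# sep = "\t" keeps the space inside each timestamp from splitting the column
dat <- read.table(text = txt, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)
str(dat)  # 'date' is still character here; it is converted in the next step
```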
dat$date <- as.POSIXct(dat$date, format="%Y-%m-%d %H:%M:%S")
split(dat , as.POSIXlt(dat$date)$yday)
# Note: as.POSIXct (first line) and as.POSIXlt (second line) are different functions
#---------------------
$`296`
date obs
1 2011-10-24 01:00:00 12
2 2011-10-24 02:00:00 4
3 2011-10-24 19:00:00 18
4 2011-10-24 20:00:00 7
5 2011-10-24 21:00:00 4
6 2011-10-24 22:00:00 2
$`297`
date obs
7 2011-10-25 00:00:00 4
8 2011-10-25 01:00:00 2
9 2011-10-25 02:00:00 2
10 2011-10-25 15:00:00 12
11 2011-10-25 18:00:00 2
12 2011-10-25 19:00:00 3
13 2011-10-25 21:00:00 2
14 2011-10-25 23:00:00 9
$`298`
date obs
15 2011-10-26 00:00:00 13
16 2011-10-26 01:00:00 11
The POSIXlt class does not work well inside data frames, but it can be very handy for creating time-based groups. It is a list structure with these components: 'yday', 'wday', 'year', 'mon', 'mday', 'hour', 'min', 'sec' and 'isdst'. The cut.POSIXt function adds divisions at other natural boundaries; e.g.
?cut.POSIXt
split(dat , cut(dat$date, "week") )
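To see what the weekly grouping looks like, here is a small self-contained sketch. It builds its own toy data frame `d` rather than reusing `dat`, the last row is made up so that a second week appears, and `tz = "UTC"` is an assumption so the result does not depend on the local time zone. By default cut.POSIXt starts weeks on Monday, and 2011-10-24 was a Monday.

```r
x <- as.POSIXct(c("2011-10-24 01:00:00", "2011-10-25 15:00:00",
                  "2011-10-26 01:00:00",
                  "2011-10-31 09:00:00"),   # made-up row, one week later
                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
d <- data.frame(date = x, obs = c(12, 12, 11, 5))

wk <- cut(d$date, "week")     # factor with one level per week
by_week <- split(d, wk)       # list of data frames, one per week
rows_per_week <- sapply(by_week, nrow)
rows_per_week                 # 3 rows in the first week, 1 in the second
```

Passing "day", "month" or "2 weeks" as the second argument to cut works the same way.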
If you wanted to sum within date:
tapply(dat$obs, as.POSIXlt(dat$date)$yday, sum)
#-------
296 297 298
47 36 24
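One caveat: yday is the zero-based day of the year, so if the data spanned more than one year, the same key could mix observations from different years. A possible alternative is to group on the formatted calendar date instead. The sketch below uses made-up values and assumes `tz = "UTC"`; it is an illustration, not part of the original answer.

```r
x <- as.POSIXct(c("2011-10-24 01:00:00", "2011-10-24 20:00:00",
                  "2012-10-23 05:00:00"),   # made-up row; 2012 is a leap year
                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
obs <- c(12, 7, 100)

# Grouping on yday alone: both dates are day 296, so the years are merged
by_yday <- tapply(obs, as.POSIXlt(x)$yday, sum)

# Grouping on the calendar date keeps the two years apart
by_date <- tapply(obs, format(x, "%Y-%m-%d"), sum)
```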