Index of next occurring record

Views: 90  Published: 2020/10/15 20:07:42  Tags: r dplyr data.table

Problem description

I have a sample dataset of the trajectory of one bike. My objective is to figure out, on average, how much time elapses between visits to Station B.

So far, I have been able to simply order the dataset with:

test[order(test$starttime, decreasing = FALSE),]

and find the rows where start_station and end_station equal "B":

 which(test$start_station == 'B')
 which(test$end_station == 'B')

The next part is where I run into trouble. To calculate the time that elapses between visits to Station B, we must take the difftime() between each record where start_station = "B" (the bike leaves) and the next occurring record where end_station = "B", even if that record happens to be the same row (see row 6).
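The pairing described above can be sketched in base R as follows. This is only one possible approach, not a confirmed answer. The two station vectors are copied from the sample data and assumed to be ordered by starttime already; the names leave_idx, arrive_idx, and next_arrival are made up for illustration.

```r
# Station columns copied from the sample data, assumed ordered by starttime.
start_station <- c("A", "B", "C", "A", "C", "B", "B", "A", "C", "B")
end_station   <- c("B", "C", "A", "C", "B", "B", "A", "C", "B", "C")

leave_idx  <- which(start_station == "B")  # rows where the bike leaves B
arrive_idx <- which(end_station == "B")    # rows where the bike arrives at B

# For each departure row, take the first arrival row at or after it.
# Using ">=" lets a same-row trip (row 6) pair with itself.
# A departure with no later arrival gets NA.
next_arrival <- vapply(
  leave_idx,
  function(i) {
    later <- arrive_idx[arrive_idx >= i]
    if (length(later) > 0) later[1] else NA_integer_
  },
  integer(1)
)

pairs <- data.frame(leave_row = leave_idx, return_row = next_arrival)
print(pairs)
```

With these vectors, the departures in rows 2, 6, and 7 pair with the arrivals in rows 5, 6, and 9, while the final departure in row 10 has no later arrival and is left as NA.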

Using the dataset below, we know that the bike spent 510 minutes outside of Station B between 7:30:00 and 16:00:00, 30 minutes between 18:00:00 and 18:30:00, and 210 minutes between 19:00:00 and 22:30:00, which averages to 250 minutes.
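As a quick check of that arithmetic, the three intervals can be computed with difftime() directly. This is a minimal sketch: the timestamps are typed in by hand from the sample table, tz = "UTC" is an assumption made only so the example behaves the same on any machine, and left_b and back_b are illustrative names.

```r
# Departure times from Station B and the matching return times,
# read off the sample table. tz = "UTC" is an assumption for reproducibility.
left_b <- as.POSIXct(
  c("2017-09-25 07:30:00", "2017-09-25 18:00:00", "2017-09-25 19:00:00"),
  tz = "UTC"
)
back_b <- as.POSIXct(
  c("2017-09-25 16:00:00", "2017-09-25 18:30:00", "2017-09-25 22:30:00"),
  tz = "UTC"
)

# Fix the units explicitly; otherwise difftime() chooses them automatically.
gaps <- difftime(back_b, left_b, units = "mins")

gap_minutes <- as.numeric(gaps)
print(gap_minutes)        # 510 30 210
print(mean(gap_minutes))  # 250
```

Setting units = "mins" matters here because difftime() otherwise picks a unit based on the size of the differences, which can make averages hard to compare across runs.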

How would one reproduce this output in R using difftime()?

> test
   bikeid start_station           starttime end_station             endtime
1       1             A 2017-09-25 01:00:00           B 2017-09-25 01:30:00
2       1             B 2017-09-25 07:30:00           C 2017-09-25 08:00:00
3       1             C 2017-09-25 10:00:00           A 2017-09-25 10:30:00
4       1             A 2017-09-25 13:00:00           C 2017-09-25 13:30:00
5       1             C 2017-09-25 15:30:00           B 2017-09-25 16:00:00
6       1             B 2017-09-25 18:00:00           B 2017-09-25 18:30:00
7       1             B 2017-09-25 19:00:00           A 2017-09-25 19:30:00
8       1             A 2017-09-25 20:00:00           C 2017-09-25 20:30:00
9       1             C 2017-09-25 22:00:00           B 2017-09-25 22:30:00
10      1             B 2017-09-25 23:00:00           C 2017-09-25 23:30:00

Here is the sample data:

> dput(test)
structure(list(bikeid = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), start_station = c("A", 
"B", "C", "A", "C", "B", "B", "A", "C", "B"), starttime = structure(c(1506315600, 
1506339000, 1506348000, 1506358800, 1506367800, 1506376800, 1506380400, 
1506384000, 1506391200, 1506394800), class = c("POSIXct", "POSIXt"
), tzone = ""), end_station = c("B", "C", "A", "C", "B", "B", 
"A", "C", "B", "C"), endtime = structure(c(1506317400, 1506340800, 
1506349800, 1506360600, 1506369600, 1506378600, 1506382200, 1506385800, 
1506393000, 1506396600), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("bikeid", 
"start_station", "starttime", "end_station", "endtime"), row.names = c(NA, 
-10L), class = "data.frame")
