Plotting frequency of occurrences based on start/end times in R


Question

I have a "trips" dataset that includes a unique trip ID and the start and end time (the specific hour and minute) of each trip. All of the trips were taken on the same day. I am trying to determine the number of cars on the road at any given time and plot it as a line graph using ggplot in R. In other words, a car is "on the road" at any time between its start and end times.

The most similar example I can find uses the following structure:

yearly_counts <- trips %>%
                 count(year, trip_id)

ggplot(data = yearly_counts, mapping = aes(x = year, y = n)) +
     geom_line()

Would the best approach be to modify this structure to have a "minutesByHour_count" variable that holds a count for every minute of every hour? This seems inefficient to me, and it still doesn't solve the problem of deriving the counts from the start/end times.

Is there a simpler way?

Recommended answer

Here's an example that counts each start as one additional car and each end as one fewer car, then takes a running total of those changes:

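library(tidyverse)

df %>%
  # Stack the start and end times into one "time" column, labelled by "type"
  gather(type, time, c(start_hour, end_hour)) %>%
  # A start puts one more car on the road; an end takes one off
  mutate(count_chg = if_else(type == "start_hour", 1, -1)) %>%
  # Walk through the events in chronological order, keeping a running total
  arrange(time) %>%
  mutate(car_count = cumsum(count_chg)) %>%
  ggplot(aes(time, car_count)) +
  # The count only changes at events, so draw it as a step line
  geom_step()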
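
To see what the pipeline computes before it reaches the plot, here is a minimal base R sketch of the same start/end counting idea. The three-trip `trips` data frame is a made-up toy example, not the answer's sample data, and the intermediate `events` table is just an illustrative name.

```r
# Hypothetical toy data: three trips, times in decimal hours
trips <- data.frame(
  trip_id    = 1:3,
  start_hour = c(8.0, 8.5, 9.0),
  end_hour   = c(9.5, 8.75, 10.0)
)

# One event per start (+1 car) and one event per end (-1 car)
events <- data.frame(
  time      = c(trips$start_hour, trips$end_hour),
  count_chg = rep(c(1, -1), each = nrow(trips))
)

# Sort the events by time, then accumulate the changes
events <- events[order(events$time), ]
events$car_count <- cumsum(events$count_chg)

print(events)
```

Each row of `events` is a moment at which the count changes, and `car_count` is the number of cars on the road from that moment until the next event. This is why no per-minute table is needed: two rows per trip are enough.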

Sample data (define df with this before running the code above):

df <- data.frame(
  uniqueID = 1:60,
  start_hour = seq(8, 12, length.out = 60),
  dur_hour = 0.05*1:60
)
df$end_hour <- df$start_hour + df$dur_hour
df$dur_hour <- NULL
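
If a count at specific times is still wanted, for example the once-per-minute grid the question mentions, the same idea can be evaluated directly: the cars on the road at time t are the trips that have started by t minus the trips that have ended by t. The sketch below is one possible way to do that in base R. `cars_on_road` is a hypothetical helper name, the start and end vectors are rebuilt to match the sample data, and a trip is assumed to count from its start time up to, but not including, its end time.

```r
# Hypothetical helper: number of trips under way at each time in `at`.
# findInterval(at, v) returns how many elements of the sorted vector v are <= at.
cars_on_road <- function(start, end, at) {
  findInterval(at, sort(start)) - findInterval(at, sort(end))
}

# Same values as the start_hour / end_hour columns of the sample data
start_hour <- seq(8, 12, length.out = 60)
end_hour   <- start_hour + 0.05 * 1:60

# Evaluate once per minute (1/60 of an hour) over the span of the data
grid   <- seq(8, 15, by = 1 / 60)
counts <- cars_on_road(start_hour, end_hour, grid)

# Brute-force cross-check: count trips with start <= t < end at each t
brute <- sapply(grid, function(t) sum(start_hour <= t & t < end_hour))
stopifnot(all(counts == brute))
```

The resulting `grid` and `counts` vectors can be put in a data frame and passed to ggplot with geom_line() if a per-minute line is preferred over the step plot.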
