确定汽车和非汽车模式之间的最大时差为1小时 [英] Determining at most 1 hour time difference between car and non-car mode

查看:49
本文介绍了确定汽车和非汽车模式之间的最大时差为1小时的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有

 household       person     time           mode
      1           1          07:45:00        non-car
      1           1          09:05:00         car
      1           2          08:10:00         non-car
      1           3          22:45:00       non-car
      1           4          08:30:00         car
      1           5          22:00:00         car
      2           1          07:45:00        non-car
      2           2          16:45:00       car

我想找到一列,以了解每个家庭中非汽车模式是否最多比汽车模式早1小时。

I want to find a column to find if non-car mode is at most 1 hour before a car mode in each family.

我需要该列作为这次与另一个交集的一个或多个人的索引。

I need that column to be index of a person or persons who has this time intersection with another one.

在上面的示例第一个家庭中,第一人称时间比第4人晚1小时,因此在新栏中第4人为第一人称婴儿和第4人为1婴儿。
输出:

In the above example first family, the time of first person is 1 hour before person 4, so in new column 4 infant of first person and 1 infant of 4th person. output:

 household       person     time           mode             overlap
      1           1          07:45:00        non-car           4
      1           1          09:05:00        car               2
      1           2          08:10:00        non-car           4,1
      1           3          22:45:00        non-car           0
      1           4          08:30:00        car               1,2
      1           5          22:00:00         car              0
      2           1          07:45:00        non-car            0
      2           2          16:45:00        car                0

与其他家庭成员的交集不为0或类似NA

no intersection with other family member is 0 or whatever like NA

推荐答案

这是一种 dplyr 方法,可产生那些匹配项。

Here's a dplyr approach that produces those matches.

library(dplyr); library(hms)
df %>%
  # Connect the table to itself, linking by household.
  #   So every row gets linked to every row (including itself)
  #   with the same household. The original data with end .x and 
  #   the joined data will end .y, so we can compare then below.
  left_join(df, by = c("household")) %>%
  # Find the difference in time, in seconds
  mutate(time_dif = abs(time.y - time.x)) %>%
  filter(time_dif < 3600,       # Keep if <1hr difference
         person.x != person.y,  # Keep if different person
         mode.x != mode.y) %>%  # Keep if different mode

  # We have the answers now, everything below is for formatting

  # Rename and hide some variables we don't need any more
  select(household, person = person.x, time = time.x, 
         mode = mode.x, other = person.y) %>%
  # Combine each person's overlaps into one row
  group_by(household, person, time) %>%
  summarise(overlaps  = paste(other, collapse =","), times = length(other)) %>%
  # Add back all original rows, even if no overlaps
  right_join(df) %>%
  ungroup()


## A tibble: 7 x 6
#  household person time   overlaps times mode   
#      <int>  <int> <time> <chr>    <int> <chr>  
#1         1      1 07:45  4            1 non-car
#2         1      1 09:05  2            1 car    
#3         1      2 08:10  1,4          2 non-car
#4         1      3 22:45  NA          NA non-car
#5         1      4 08:30  1,2          2 car    
#6         2      1 07:45  NA          NA non-car
#7         2      2 16:45  NA          NA car   

这篇关于确定汽车和非汽车模式之间的最大时差为1小时的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆