确定汽车和非汽车模式之间的最大时差为1小时 [英] Determining at most 1 hour time difference between car and non-car mode
本文介绍了确定汽车和非汽车模式之间的最大时差为1小时的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有
household person time mode
1 1 07:45:00 non-car
1 1 09:05:00 car
1 2 08:10:00 non-car
1 3 22:45:00 non-car
1 4 08:30:00 car
1 5 22:00:00 car
2 1 07:45:00 non-car
2 2 16:45:00 car
我想找到一列,以了解每个家庭中非汽车模式是否最多比汽车模式早1小时。
I want to find a column to find if non-car mode is at most 1 hour before a car mode in each family.
我需要该列作为这次与另一个交集的一个或多个人的索引。
I need that column to be index of a person or persons who has this time intersection with another one.
在上面的示例第一个家庭中,第一人称时间比第4人晚1小时,因此在新栏中第4人为第一人称婴儿和第4人为1婴儿。
输出:
In the above example first family, the time of first person is 1 hour before person 4, so in new column 4 infant of first person and 1 infant of 4th person. output:
household person time mode overlap
1 1 07:45:00 non-car 4
1 1 09:05:00 car 2
1 2 08:10:00 non-car 4,1
1 3 22:45:00 non-car 0
1 4 08:30:00 car 1,2
1 5 22:00:00 car 0
2 1 07:45:00 non-car 0
2 2 16:45:00 car 0
与其他家庭成员的交集不为0或类似NA
no intersection with other family member is 0 or whatever like NA
推荐答案
这是一种 dplyr
方法,可产生那些匹配项。
Here's a dplyr
approach that produces those matches.
library(dplyr); library(hms)
df %>%
# Connect the table to itself, linking by household.
# So every row gets linked to every row (including itself)
# with the same household. The original data with end .x and
# the joined data will end .y, so we can compare then below.
left_join(df, by = c("household")) %>%
# Find the difference in time, in seconds
mutate(time_dif = abs(time.y - time.x)) %>%
filter(time_dif < 3600, # Keep if <1hr difference
person.x != person.y, # Keep if different person
mode.x != mode.y) %>% # Keep if different mode
# We have the answers now, everything below is for formatting
# Rename and hide some variables we don't need any more
select(household, person = person.x, time = time.x,
mode = mode.x, other = person.y) %>%
# Combine each person's overlaps into one row
group_by(household, person, time) %>%
summarise(overlaps = paste(other, collapse =","), times = length(other)) %>%
# Add back all original rows, even if no overlaps
right_join(df) %>%
ungroup()
## A tibble: 7 x 6
# household person time overlaps times mode
# <int> <int> <time> <chr> <int> <chr>
#1 1 1 07:45 4 1 non-car
#2 1 1 09:05 2 1 car
#3 1 2 08:10 1,4 2 non-car
#4 1 3 22:45 NA NA non-car
#5 1 4 08:30 1,2 2 car
#6 2 1 07:45 NA NA non-car
#7 2 2 16:45 NA NA car
这篇关于确定汽车和非汽车模式之间的最大时差为1小时的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文