Group and summarize with iterative filter using dplyr
Question
Apologies up front if this has been asked before. I have been searching all day and have not found an answer I can apply to my problem.
I am trying to solve this using dplyr (and co.) because my previous method (for loops) was too inefficient. I have a dataset of event times at sites, and the sites belong to groups. I want to summarize the number (and proportion) of events that occur in a moving window along a sequence.
# Example data
set.seed(1)
sites = rep(letters[1:10], 10)
groups = c('red', 'blue', 'green', 'yellow')
times = round(runif(length(sites), 1, 100))
timePeriod = seq(1, 100)
# Example dataframe
df = data.frame(site = sites,
                group = rep(groups, length(sites) / length(groups)),
                time = times)
This is my attempt to summarize the number of sites from each group that contain a time (event) within a given moving window of time. The goal is to move through each element of the vector timePeriod and summarize how many events in each group occurred at timePeriod[i] +/- half-window. Ultimately, storing them in, e.g., a dataframe with a column for each group and a row for each time step would be ideal.
library(dplyr)

# i is an index into timePeriod; 25 is the half-window
df %>%
  filter(time > timePeriod[i] - 25 & time < timePeriod[i] + 25) %>%
  group_by(group) %>%
  summarise(count = n())
How can I do this without looping through my sequence of times and storing the summary table for each group individually? Thanks!
Recommended answer
Combining lapply and dplyr, you can do the following, which is close to what you have worked out so far.
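library(dplyr)

# For each time step i, count events per group within i +/- 25,
# tag the result with the step, then stack all steps into one data frame
lapply(timePeriod, function(i) {
  df %>%
    filter(time > (i - 25) & time < (i + 25)) %>%
    group_by(group) %>%
    summarise(count = n()) %>%
    mutate(step = i)
}) %>%
  bind_rows()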
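The result above is in long format (one row per step and group), whereas the question asks for one row per time step, one column per group, and proportions as well. A minimal sketch of one way to get there follows. It assumes tidyr is available for pivot_wider, and it assumes "proportion" means the share of each group's events that fall inside the window, which the question does not spell out. The names half_window, long, wide, group_totals and props are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Example data, as in the question
set.seed(1)
sites  <- rep(letters[1:10], 10)
groups <- c("red", "blue", "green", "yellow")
df <- data.frame(site  = sites,
                 group = rep(groups, length(sites) / length(groups)),
                 time  = round(runif(length(sites), 1, 100)),
                 stringsAsFactors = FALSE)
timePeriod  <- seq(1, 100)
half_window <- 25  # hypothetical name for the half-window used in the answer

# Long format: one row per (step, group) with at least one event in the window
long <- lapply(timePeriod, function(i) {
  df %>%
    filter(time > (i - half_window) & time < (i + half_window)) %>%
    group_by(group) %>%
    summarise(count = n()) %>%
    mutate(step = i)
}) %>%
  bind_rows()

# Wide format: one row per step, one column per group;
# a group with no events in a window gets 0 instead of NA
wide <- long %>%
  pivot_wider(names_from = group, values_from = count, values_fill = 0)

# Assumed definition of proportion: events in the window / all events in that group
group_totals <- count(df, group, name = "total")
props <- long %>%
  left_join(group_totals, by = "group") %>%
  mutate(prop = count / total)
```

A step at which no group has any event would be missing from wide altogether. If that can happen with your data, joining wide onto data.frame(step = timePeriod) restores those rows.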