dplyr count observations per day


Problem description

I have the following data:

Name         Date                                   Message
Ted Foe      2011-06-10T05:06:30+0000               I love this product
Sina Fall    2011-06-10T05:07:33+0000               Not my type of product
Steve Hoe    2011-06-11T05:06:30+0000               Great Discussion! Thanks
Selda Dee    2011-06-13T05:12:30+0000               Seen elsewhere
Steven Hoe   2011-06-13T03:17:31+0000               Where?
Selda Dee    2011-06-13T05:17:56+0000               Tinder

I want to aggregate by day so that I end up with a time series like this:

Date            Number of Posts
2011-06-10      2
2011-06-11      1
2011-06-12      0
2011-06-13      3

I have tried the following:

summary_df <- df %>% group_by(Date) %>% summarise(comments = count(message))

But this does not work.

Thanks for your help!

Cheers, Raoul

Recommended answer

Convert the 'Date' column to Date class so that the time of day is dropped, group by it, and get the number of rows in each group with n() inside summarise. (count() expects a data frame rather than a single column, which is why the original attempt fails.) If we also need the dates that are missing from the original dataset, create a new dataset holding the sequence from the minimum to the maximum 'Date' and do a left_join. The rows for those missing days come back as NA, which can then be replaced with 0.

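library(dplyr)
# Count posts per calendar day
df1 <- df %>%
          group_by(Date = as.Date(Date)) %>%
          summarise(comments = n())
# Add the days with no posts, then turn their NA counts into 0
expand.grid(Date = seq(min(df1$Date), max(df1$Date), by = '1 day')) %>%
         left_join(df1, by = 'Date') %>%
         mutate(comments = coalesce(comments, 0L))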
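
To try the approach end to end, here is a minimal sketch that rebuilds the sample data from the question and produces the zero-filled daily series. The names `daily`, `all_days`, `result` and `posts` are illustrative and not part of the original answer, and the `name` argument of count() assumes a reasonably recent dplyr version.

```r
library(dplyr)

# Sample data copied from the question; `df` is rebuilt here for illustration
df <- data.frame(
  Name = c("Ted Foe", "Sina Fall", "Steve Hoe",
           "Selda Dee", "Steven Hoe", "Selda Dee"),
  Date = c("2011-06-10T05:06:30+0000", "2011-06-10T05:07:33+0000",
           "2011-06-11T05:06:30+0000", "2011-06-13T05:12:30+0000",
           "2011-06-13T03:17:31+0000", "2011-06-13T05:17:56+0000"),
  Message = c("I love this product", "Not my type of product",
              "Great Discussion! Thanks", "Seen elsewhere",
              "Where?", "Tinder"),
  stringsAsFactors = FALSE
)

# Count rows per calendar day; as.Date() keeps only the leading yyyy-mm-dd part
daily <- df %>%
  mutate(Date = as.Date(Date)) %>%
  count(Date, name = "posts")

# Build every day in the observed range, join the counts, zero-fill the gaps
all_days <- data.frame(
  Date = seq(min(daily$Date), max(daily$Date), by = "1 day")
)
result <- all_days %>%
  left_join(daily, by = "Date") %>%
  mutate(posts = coalesce(posts, 0L))

print(result)
```

With the six sample rows this should give one row per day from 2011-06-10 to 2011-06-13 with counts 2, 1, 0 and 3. Here count(Date) is used as shorthand for group_by(Date) followed by summarise(n()); either form should work.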
