Summarize data and keep date column value

Views: 80 Published: 2020/10/18 6:43:54 Tags: r date sum


Problem description

I asked a similar question before (Summarize and count data in R with dplyr) and got an excellent answer, but I need more guidance on summarizing data that includes dates.

Goal:

In my new dataset I have a column with dates recording when each event occurred. When I proceed as in the example suggested in the other post, I get an error message:

Dataset:

df <- structure(list(User = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
Date = c("25.11.2015 13:59", "03.12.2015 09:32",  "07.12.2015 08:18", "08.12.2015 19:40", "08.12.2015 19:40",
"22.12.2015 08:50",  "22.12.2015 08:52", "05.01.2016 13:22", 
"06.01.2016 09:18", "14.02.2016 22:47",  
"20.02.2016 21:27", "01.04.2016 13:52", "24.07.2016 07:03"), 
    StimuliA = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 1L), StimuliB = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 
    1L, 0L, 0L, 0L), R2 = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 1L, 1L, 0L), R3 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 
    0L, 0L, 0L, 0L), R4 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L), R5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L), R6 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 
    0L, 0L, 0L, 0L), R7 = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 
    0L, 0L, 0L, 0L), stims = c("0_0", "0_0", "1_0", "1_0", "1_1", 
    "1_1", "1_1", "1_1", "1_1", "1_2", "1_2", "1_2", "2_2")), .Names = c("User",  "Date", "StimuliA", "StimuliB", "R2", "R3", "R4", "R5", "R6",  "R7", "stims"), row.names = c(NA, -13L), spec = structure(list(
    cols = structure(list(User = structure(list(), class = c("collector_integer", 
    "collector")), Date = structure(list(), class = c("collector_character", 
    "collector")), StimuliA = structure(list(), class = c("collector_integer", 
    "collector")), StimuliB = structure(list(), class = c("collector_integer", 
    "collector")), R2 = structure(list(), class = c("collector_integer", 
    "collector")), R3 = structure(list(), class = c("collector_integer", 
    "collector")), R4 = structure(list(), class = c("collector_integer", 
    "collector")), R5 = structure(list(), class = c("collector_integer", 
    "collector")), R6 = structure(list(), class = c("collector_integer", 
    "collector")), R7 = structure(list(), class = c("collector_integer", 
    "collector"))), .Names = c("User", "Date", "StimuliA", "StimuliB", 
    "R2", "R3", "R4", "R5", "R6", "R7")), default = structure(list(), class = c("collector_guess", 
    "collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df",  "tbl", "data.frame"))

Code:

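# Label each row with the running counts of StimuliA and StimuliB
df$stims <- with(df, paste(cumsum(StimuliA), cumsum(StimuliB), sep = "_"))
# Sum every remaining column per User and stims; this fails on the Date column
aggregate(. ~ User + stims, data = df, sum)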
Error in Summary.factor(c(12L, 2L), na.rm = FALSE) : 
‘sum’ not meaningful for factors
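
The error most likely comes from the `.` in the formula, which expands to every column not named on the right-hand side, including `Date`. The sketch below reproduces that on a small made-up data frame (the `toy` data and its values are assumptions, with `Date` stored as a factor as the error message suggests) and shows one way around it: naming only the numeric columns with `cbind()`. A character `Date` column would fail as well, only with a different message.

```r
# Toy data (hypothetical); Date is a factor, as the error message implies
toy <- data.frame(
  User  = c(1L, 1L, 2L),
  Date  = factor(c("25.11.2015 13:59", "03.12.2015 09:32", "05.01.2016 13:22")),
  R2    = c(1L, 0L, 0L),
  R3    = c(0L, 1L, 1L),
  stims = c("0_0", "0_0", "1_1"),
  stringsAsFactors = FALSE
)

# `.` expands to Date, R2 and R3, so sum() is applied to a factor and fails
res <- try(aggregate(. ~ User + stims, data = toy, sum), silent = TRUE)
failed <- inherits(res, "try-error")

# Naming only the numeric columns on the left-hand side avoids the problem
sums <- aggregate(cbind(R2, R3) ~ User + stims, data = toy, sum)
```

This leaves the date out of the sums entirely, so it still has to be summarized separately and joined back in.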

Question/Desired result: In my result, I would like to keep the date on which the stimulus occurred (or, while StimuliA and StimuliB are both still 0, the first date for that user).

User    Date         StimuliA   StimuliB    R2  R3  R4  R5  R6  R7
 1  25.11.2015 13:59     0         0        1   0   0   0   0   1
 1  07.12.2015 08:18     1         0        0   0   0   0   1   0
 1  08.12.2015 19:40     0         1        0   2   0   0   1   1
 2  05.01.2016 13:22     0         0        0   0   0   0   1   0 
 2  14.02.2016 22:47     0         1        2   0   0   0   0   0
 2  24.07.2016 07:03     1         0        0   0   0   0   0   0

In this result table, the first line holds the sums of the values R2-R7 while StimuliA and StimuliB are still 0. Then, for each stimulus, the row holds the sums of R2-R7 up to the point where the next stimulus occurs.

This was suggested in the previous post, but I am unable to make it work:

You don't want to work with dates as factors. Transform the date to a Date variable using as.Date (many posts on this on SO). One method then would be to separately aggregate the date variable by User and stims similar to above, taking the min rather than the sum. Then merge the two resulting data.frames. If this does not make sense, it might be worth asking a new question that links to this question, adding the additional problem of the date variable. Also include an example dataset that includes this variable. @lmo
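
The suggestion quoted above could be sketched roughly as follows. This is one possible reading, not a confirmed answer: the data frame is a shortened sample modeled on the question's data, and `as.POSIXct` is used in place of `as.Date` on the assumption that the time of day should be kept. The `"UTC"` time zone and the names `sums`, `dates` and `result` are likewise illustrative choices.

```r
# Shortened sample modeled on the question's data (assumption)
df <- data.frame(
  User     = c(1L, 1L, 1L, 1L, 2L, 2L),
  Date     = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
               "08.12.2015 19:40", "05.01.2016 13:22", "14.02.2016 22:47"),
  StimuliA = c(0L, 0L, 1L, 0L, 0L, 0L),
  StimuliB = c(0L, 0L, 0L, 0L, 0L, 1L),
  R2       = c(1L, 0L, 0L, 0L, 0L, 0L),
  R7       = c(0L, 1L, 0L, 0L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Parse day.month.year hour:minute; as.POSIXct keeps the time of day
df$Date <- as.POSIXct(df$Date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Running stimulus counts, as in the question (cumulated across all users)
df$stims <- with(df, paste(cumsum(StimuliA), cumsum(StimuliB), sep = "_"))

# 1) Sum the numeric columns per User and stims, leaving Date out
sums <- aggregate(cbind(StimuliA, StimuliB, R2, R7) ~ User + stims,
                  data = df, sum)

# 2) Take the earliest date per User and stims
dates <- aggregate(Date ~ User + stims, data = df, min)

# 3) Merge the two results and order them by user and time
result <- merge(dates, sums, by = c("User", "stims"))
result <- result[order(result$User, result$Date), ]
```

To show the dates in the original layout again, something like `format(result$Date, "%d.%m.%Y %H:%M", tz = "UTC")` should work; the helper column `stims` can be dropped afterwards if it is not needed in the output.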
