Count rows from every previous value


Problem description

This question is the same as here, but this time I want to divide every value by the count of the previous value rather than by its own count. For the first value (1500) the result is NA, because no value precedes it. Then 1100 is divided by 4, because the previous value (1500) occurs 4 times. Then 200 is divided by 3, because the previous value (1100) occurs 3 times. Finally, the last 1100 is divided by 2, because 200 occurs 2 times. I tried using shift/lag but could not get it to work.
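As a rough illustration of the arithmetic described above, the sketch below works on the four non-NA runs only. The vector names `run_value`, `run_count`, `prev_count` and `expected` are made up for this example; the run lengths are read off the sample data in this question.

```r
# Non-NA runs in the sample data, in order of appearance (illustrative names)
run_value <- c(1500, 1100, 200, 1100)
run_count <- c(4, 3, 2, 3)

# Shift the counts down by one run; the first run has no predecessor
prev_count <- c(NA, head(run_count, -1))

# Each run's value divided by the length of the run before it
expected <- run_value / prev_count
expected  # NA, 275, about 66.7, 550
```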

This is the code that divides every value by its own count:

library(dplyr)   # group_by(), mutate(), n(), %>%
library(tibble)  # tibble()
# data.table must be installed for data.table::rleid()


df <- tibble(mydate = as.Date(c("2019-05-11 23:01:00", "2019-05-11 23:02:00", "2019-05-11 23:03:00", "2019-05-11 23:04:00",
                                "2019-05-12 23:05:00", "2019-05-12 23:06:00", "2019-05-12 23:07:00", "2019-05-12 23:08:00",
                                "2019-05-13 23:09:00", "2019-05-13 23:10:00", "2019-05-13 23:11:00", "2019-05-13 23:12:00",
                                "2019-05-14 23:13:00", "2019-05-14 23:14:00", "2019-05-14 23:15:00", "2019-05-14 23:16:00",
                                "2019-05-15 23:17:00", "2019-05-15 23:18:00", "2019-05-15 23:19:00", "2019-05-15 23:20:00")),
               myval = c(0, NA, 1500, 1500,
                         1500, 1500, NA, 0,
                         0, 0, 1100, 1100,
                         1100, 0, 200, 200,
                         1100, 1100, 1100, 0
               ))

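# Replace values in the range [0, 1] with NA (here, the zeros)
df$myval[df$myval >= 0 & df$myval <= 1] <- NA


# Number each run of consecutive identical values, then divide by the run's own length
df <- df %>%
  group_by(grp = data.table::rleid(myval)) %>%
  mutate(counts = n(),
         result = myval / counts)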


#   mydate     myval   grp counts result
#   <date>     <dbl> <int>  <int>  <dbl>
# 1 2019-05-11    NA     1      2    NA 
# 2 2019-05-11    NA     1      2    NA 
# 3 2019-05-11  1500     2      4   375 
# 4 2019-05-11  1500     2      4   375 
# 5 2019-05-12  1500     2      4   375 
# 6 2019-05-12  1500     2      4   375 
# 7 2019-05-12    NA     3      4    NA 
# 8 2019-05-12    NA     3      4    NA 
# 9 2019-05-13    NA     3      4    NA 
#10 2019-05-13    NA     3      4    NA 
#11 2019-05-13  1100     4      3   367.
#12 2019-05-13  1100     4      3   367.
#13 2019-05-14  1100     4      3   367.
#14 2019-05-14    NA     5      1    NA 
#15 2019-05-14   200     6      2   100 
#16 2019-05-14   200     6      2   100 
#17 2019-05-15  1100     7      3   367.
#18 2019-05-15  1100     7      3   367.
#19 2019-05-15  1100     7      3   367.
#20 2019-05-15    NA     8      1    NA 

I want to keep the data frame above, including the date column, and get the correct result.

Recommended answer

Here is one way to do it:

library(dplyr)
# Create a run number for each stretch of consecutive identical values
# (start from the original df, before the group_by() step in the question)
df1 <- df %>% mutate(grp = data.table::rleid(myval))

df1 %>%
  # Keep only non-NA values
  filter(!is.na(myval)) %>%
  # Count the rows in each grp
  count(grp, name = 'count') %>%
  # Give each group the count of the previous non-NA group
  mutate(count = lag(count)) %>%
  # Join back to the original data
  right_join(df1, by = 'grp') %>%
  # Divide by the shifted count to get the final result
  mutate(result = myval / count) %>%
  arrange(grp)

This returns:

# A tibble: 20 x 5
#     grp count mydate     myval result
#   <int> <int> <date>     <dbl>  <dbl>
# 1     1    NA 2019-05-11    NA   NA  
# 2     1    NA 2019-05-11    NA   NA  
# 3     2    NA 2019-05-11  1500   NA  
# 4     2    NA 2019-05-11  1500   NA  
# 5     2    NA 2019-05-12  1500   NA  
# 6     2    NA 2019-05-12  1500   NA  
# 7     3    NA 2019-05-12    NA   NA  
# 8     3    NA 2019-05-12    NA   NA  
# 9     3    NA 2019-05-13    NA   NA  
#10     3    NA 2019-05-13    NA   NA  
#11     4     4 2019-05-13  1100  275  
#12     4     4 2019-05-13  1100  275  
#13     4     4 2019-05-14  1100  275  
#14     5    NA 2019-05-14    NA   NA  
#15     6     3 2019-05-14   200   66.7
#16     6     3 2019-05-14   200   66.7
#17     7     2 2019-05-15  1100  550  
#18     7     2 2019-05-15  1100  550  
#19     7     2 2019-05-15  1100  550  
#20     8    NA 2019-05-15    NA   NA  
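
For comparison, the same idea can be sketched in base R with `rle()`, without the join. This is only an illustrative alternative, not the approach from the answer above; the names `r`, `keep`, `prev_len`, `divisor` and `result` are made up here, and the input vector is the question's `myval` with the zeros already set to NA. Unlike `data.table::rleid()`, base `rle()` treats every NA as its own run, which is harmless in this sketch because only the non-NA runs are used.

```r
# Sample values from the question, with the zeros already replaced by NA
myval <- c(NA, NA, 1500, 1500, 1500, 1500, NA, NA, NA, NA,
           1100, 1100, 1100, NA, 200, 200, 1100, 1100, 1100, NA)

# Run-length encode; each NA becomes a separate run of length 1
r <- rle(myval)
keep <- !is.na(r$values)  # runs that hold a real value

# For each non-NA run, look up the length of the previous non-NA run
prev_len <- rep(NA_integer_, length(r$lengths))
prev_len[keep] <- c(NA, head(r$lengths[keep], -1))

# Expand back to one divisor per row and divide
divisor <- rep(prev_len, r$lengths)
result <- myval / divisor
```

The `result` vector lines up row by row with the original data, so it could be attached to the data frame as a new column while the date column stays untouched.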
