Group to Group division
Question
Dataset:
date bal
1/31/2013 10
1/31/2013 11
1/31/2013 12
1/31/2013 13
1/31/2013 14
2/28/2013 20
2/28/2013 30
2/28/2013 40
2/28/2013 50
2/28/2013 60
3/30/2013 10
3/30/2013 11
3/30/2013 12
3/30/2013 13
3/30/2013 15
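Code used:
# Read the data set shown above
bb <- read.csv("abc.csv", stringsAsFactors = TRUE, header = TRUE)
bb
library(dplyr)
# Attempt: divide bal by the lagged values of the first five rows
new_data <- bb %>%
  mutate(D = (bal) / lag(bal[1:5])) %>%
  data.frame()
new_data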
We are dividing group 2 (date 2/28/2013), starting from its second row (30), by group 1 (date 1/31/2013), starting from its first row (10). That is: 30 / 10 = 3.000, 40 / 11 = 3.636, 50 / 12 = 4.167, and so on.
Output from the above code:
date bal D
1 1/31/2013 10 NA
2 1/31/2013 11 1.100000
3 1/31/2013 12 1.090909
4 1/31/2013 13 1.083333
5 1/31/2013 14 1.076923
6 2/28/2013 20 NA
7 2/28/2013 30 3.000000
8 2/28/2013 40 3.636364
9 2/28/2013 50 4.166667
10 2/28/2013 60 4.615385
11 3/30/2013 10 NA
12 3/30/2013 11 1.100000
13 3/30/2013 12 1.090909
14 3/30/2013 13 1.083333
15 3/30/2013 15 1.153846
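The D column above can be reproduced in base R, which makes the cause visible: the lagged first five values are reused for every group of five rows. This is only a sketch; the vector names first_group_lagged and divisor are illustrative, and the recycling is written out explicitly with rep() on the assumption that this is what mutate() did here (newer dplyr versions may reject a length-5 result instead of recycling it).

```r
# bal column from the data set in the question
bal <- c(10, 11, 12, 13, 14,
         20, 30, 40, 50, 60,
         10, 11, 12, 13, 15)

# Equivalent of lag(bal[1:5]): NA, 10, 11, 12, 13
first_group_lagged <- c(NA, head(bal[1:5], -1))

# The same five divisors repeated over all 15 rows
divisor <- rep(first_group_lagged, length.out = length(bal))

D <- bal / divisor
round(D, 6)
```

Rows 12 to 15 are therefore divided by 10, 11, 12, 13 from the first group rather than by values from the second group.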
The problem here is:
The first group is kept as the reference divisor (10, 11, 12, 13), which means the bal values of every later date group are divided by that first group.
We want the divisor to advance with each date group instead: each group (the dividend) should be divided by the group immediately before it, and so on.
date bal D
1 1/31/2013 10 NA
2 1/31/2013 11 NA
3 1/31/2013 12 NA
4 1/31/2013 13 NA
5 1/31/2013 14 NA
6 2/28/2013 20 NA
7 2/28/2013 30 3.000000 - 30 / 10 = 3
8 2/28/2013 40 3.636364 - 40 / 11 = 3.636
9 2/28/2013 50 4.166667 - 50 / 12 = 4.167
10 2/28/2013 60 4.615385 - 60 / 13 = 4.615
11 3/30/2013 10 NA
12 3/30/2013 11 0.550000 - 11 / 20 = 0.55
13 3/30/2013 12 0.400000 - 12 / 30 = 0.4
14 3/30/2013 13 0.325000 - 13 / 40 = 0.325
15 3/30/2013 15 0.300000 - 15 / 50 = 0.3
I expect the output above.
Recommended answer
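library(dplyr)
# DF is the data frame from the question (called bb there)
DF %>%
  # g1: position within each block of 5 rows, so lag() reaches 5 rows back
  group_by(g1 = seq_along(bal) %% 5) %>%
  mutate(denominator = lag(bal)) %>%
  ungroup() %>%
  # g2: index of each block of 5 rows; lag() here shifts the denominator down one row
  group_by(g2 = (seq_along(bal) - 1) %/% 5) %>%
  mutate(denominator = lag(denominator),
         D = bal / denominator) %>%
  ungroup()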
# # A tibble: 15 x 6
# date bal g1 denominator g2 D
# <fctr> <int> <dbl> <int> <dbl> <dbl>
# 1 1/31/2013 10 1 NA 0 NA
# 2 1/31/2013 11 2 NA 0 NA
# 3 1/31/2013 12 3 NA 0 NA
# 4 1/31/2013 13 4 NA 0 NA
# 5 1/31/2013 14 0 NA 0 NA
# 6 2/28/2013 20 1 NA 1 NA
# 7 2/28/2013 30 2 10 1 3.000000
# 8 2/28/2013 40 3 11 1 3.636364
# 9 2/28/2013 50 4 12 1 4.166667
# 10 2/28/2013 60 0 13 1 4.615385
# 11 3/30/2013 10 1 NA 2 NA
# 12 3/30/2013 11 2 20 2 0.550000
# 13 3/30/2013 12 3 30 2 0.400000
# 14 3/30/2013 13 4 40 2 0.325000
# 15 3/30/2013 15 0 50 2 0.300000
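The answer above hard-codes a group size of 5. As a hedged alternative, the base R sketch below derives the groups from the date column, so the group size is not fixed in the code. It assumes rows are ordered by date group and that row k of one group should be divided by row k - 1 of the previous group; the names grp, pos, key and want_key are illustrative and not part of the original answer.

```r
# Data set from the question, rebuilt inline so the sketch is self-contained
DF <- data.frame(
  date = rep(c("1/31/2013", "2/28/2013", "3/30/2013"), each = 5),
  bal  = c(10, 11, 12, 13, 14, 20, 30, 40, 50, 60, 10, 11, 12, 13, 15)
)

# Group index in order of appearance (1, 2, 3) and position within each group
grp <- match(DF$date, unique(DF$date))
pos <- ave(seq_along(grp), grp, FUN = seq_along)

# For each row, look up the row at (previous group, previous position)
key      <- paste(grp, pos)
want_key <- paste(grp - 1, pos - 1)

# match() returns NA where no such row exists, which leaves the denominator NA
DF$denominator <- DF$bal[match(want_key, key)]
DF$D <- DF$bal / DF$denominator
DF
```

The first group and the first row of every later group have no matching row, so their D stays NA, as in the expected output.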