Grouping and counting to get a closerate

Question

I want to count per country the number of times the status is open and the number of times the status is closed. Then calculate the closerate per country.

Data:

customer <- c(1,2,3,4,5,6,7,8,9)
country <- c('BE', 'NL', 'NL','NL','BE','NL','BE','BE','NL')
closeday <- c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26', 
'2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05', '2017-08-23')
closeday <- as.Date(closeday)

df <- data.frame(customer,country,closeday)

Add the status:

df$status <- ifelse(df$closeday < '2017-08-20', 'open', 'closed') 

  customer country   closeday status
1        1      BE 2017-08-23 closed
2        2      NL 2017-08-05   open
3        3      NL 2017-08-22 closed
4        4      NL 2017-08-26 closed
5        5      BE 2017-08-25 closed
6        6      NL 2017-08-13   open
7        7      BE 2017-08-30 closed
8        8      BE 2017-08-05   open
9        9      NL 2017-08-23 closed

Calculate the closerate:

closerate <- length(which(df$status == 'closed')) / 
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))

[1] 0.6666667

Obviously, this is the closerate for the total. The challenge is to get the closerate per country. I tried adding the closerate calculation to df by:

df$closerate <- length(which(df$status == 'closed')) / 
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))

But it gives all lines a closerate of 0.66 because I'm not grouping. I believe I should not use the length function because counting can be done by grouping. I read some information about using dplyr to count logical outputs per group but this didn't work out.

This is the desired output:

Recommended answer

aggregate(list(output = df$status == "closed"),
          list(country = df$country),
          function(x)
              c(close = sum(x),
                open = length(x) - sum(x),
                rate = mean(x)))
#  country output.close output.open output.rate
#1      BE         3.00        1.00        0.75
#2      NL         3.00        2.00        0.60
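
Note that aggregate() stores the three values in a single matrix column named output, so the printed names output.close, output.open and output.rate are not real columns yet. A minimal sketch of flattening the result, assuming the same df as in the question (the names agg and flat are only illustrative):

```r
# Rebuild the example data from the question
customer <- 1:9
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# Same aggregate() call as in the answer
agg <- aggregate(list(output = df$status == "closed"),
                 list(country = df$country),
                 function(x)
                     c(close = sum(x),
                       open = length(x) - sum(x),
                       rate = mean(x)))

# agg has two columns: country and a 3-column matrix called output.
# do.call(data.frame, ...) expands the matrix into ordinary columns.
flat <- do.call(data.frame, agg)
names(flat)
flat$output.rate
```

After flattening, flat$output.rate can be used like any other column, for example to merge the rate back onto df by country.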

There was a solution using table in the comments, which appears to have been deleted. Anyway, you could also use table:

output = as.data.frame.matrix(table(df$country, df$status))
output$closerate = output$closed/(output$closed + output$open)
output
#   closed open closerate
#BE      3    1      0.75
#NL      3    2      0.60
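
The question also tried to store the closerate as a column of df. A minimal base R sketch of that, assuming the same df as in the question: since the mean of a TRUE/FALSE vector is the share of TRUE values, tapply() gives one rate per country and ave() repeats the group rate on every row (the name rates is only illustrative).

```r
# Rebuild the example data from the question
customer <- 1:9
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# One closerate per country: mean of the logical "is closed" vector per group
rates <- tapply(df$status == "closed", df$country, mean)
rates

# Per-row version: ave() applies mean() within each country (its default FUN)
# and returns a vector as long as df, so it can be assigned as a column
df$closerate <- ave(as.numeric(df$status == "closed"), df$country)
df
```

Every BE row gets 0.75 and every NL row gets 0.60, which is the grouped counterpart of the ungrouped 0.66 from the question.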
