Getting net values as a proportion from a dataframe in R
Question
I have a dataframe in R (p2.df) that aggregates a range of values into the following (there are many more columns; this is just an abridged version):
genre      rating   cc      dd     ee
Adventure  FAILURE  140393  20865  358806
Adventure  SUCCESS  197182  32872  492874
Fiction    FAILURE  140043  14833  308602
Fiction    SUCCESS  197725  28848  469879
Sci-fi     FAILURE  8681    1682   24259
Sci-fi     SUCCESS  7439    1647   22661
I want to get the net values of the proportions for each column, which I can get in a spreadsheet but not in RStudio.
The formula in the spreadsheet follows the pattern:
net_cc = cc(success)/(cc(success)+dd(success)+ee(success)) - cc(fail)/(cc(fail)+dd(fail)+ee(fail))
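As a quick check of the pattern above, here is a minimal base R sketch for a single genre. It uses the Adventure rows from the abridged table; the vector names `success`, `failure`, and `net` are illustrative and not part of the original question.

```r
# Adventure rows from the abridged table in the question
success <- c(cc = 197182, dd = 32872, ee = 492874)
failure <- c(cc = 140393, dd = 20865, ee = 358806)

# Share of each column within its row, then SUCCESS share minus FAILURE share
net <- success / sum(success) - failure / sum(failure)
net
```

Because each set of shares sums to 1, the three net values for a genre always sum to 0, which is a handy sanity check against the spreadsheet.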
What I want to get out in R is this table, which I can get from the spreadsheet:
genre      net_cc           net_dd          net_ee
Adventure  0.002801373059   0.005350579467  -0.008151952526
Fiction    -0.01825346696   0.009417699223  0.008835767735
Sci-fi     -0.01641517271   0.003297091109  0.0131180816
Any ideas how? If it's any use, I created p2.df by summarising a previous table as:
library(dplyr)
p2.df<- s2.df %>% group_by(genre,rating) %>% summarise_all(sum)
Recommended answer
Using tidyverse:
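library(tidyverse)
# df is the abridged data frame from the question (p2.df); columns 3:5 are cc, dd, ee
df %>%
  gather(key, value, 3:5) %>%   # long format: one row per genre/rating/column
  spread(rating, value) %>%     # FAILURE and SUCCESS become columns
  group_by(genre) %>%
  # share within SUCCESS minus share within FAILURE, per genre
  transmute(key, net = SUCCESS/sum(SUCCESS) - FAILURE/sum(FAILURE)) %>%
  ungroup() %>%
  spread(key, net)              # back to wide format: one column per original column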
# # A tibble: 3 x 4
#   genre           cc      dd       ee
#   <chr>        <dbl>   <dbl>    <dbl>
# 1 Adventure  0.00280 0.00535 -0.00815
# 2 Fiction   -0.0183  0.00942  0.00884
# 3 Sci-fi    -0.0164  0.00330  0.0131
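In current tidyr, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`, so the same idea can be sketched with those instead. This is an alternative sketch, not the answer's original code: the data frame is rebuilt by hand from the abridged table, `net.df` is a hypothetical name, and `cc:ee` assumes only the three columns shown (the real p2.df has more, so the column selection would need adjusting).

```r
library(dplyr)
library(tidyr)

# Abridged data from the question, rebuilt by hand so the example is reproducible
p2.df <- tibble(
  genre  = rep(c("Adventure", "Fiction", "Sci-fi"), each = 2),
  rating = rep(c("FAILURE", "SUCCESS"), times = 3),
  cc = c(140393, 197182, 140043, 197725, 8681, 7439),
  dd = c(20865, 32872, 14833, 28848, 1682, 1647),
  ee = c(358806, 492874, 308602, 469879, 24259, 22661)
)

net.df <- p2.df %>%
  ungroup() %>%  # no-op here; drops the grouping left behind by summarise_all()
  pivot_longer(cc:ee, names_to = "key", values_to = "value") %>%
  pivot_wider(names_from = rating, values_from = value) %>%
  group_by(genre) %>%
  # share within SUCCESS minus share within FAILURE, per genre
  mutate(net = SUCCESS / sum(SUCCESS) - FAILURE / sum(FAILURE)) %>%
  ungroup() %>%
  select(genre, key, net) %>%
  # one column per original column, named net_cc, net_dd, net_ee
  pivot_wider(names_from = key, values_from = net, names_prefix = "net_")

net.df
```

The `names_prefix = "net_"` argument is only there to match the column names in the table requested in the question.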