Using dplyr to calculate row ratios

Problem description

I have a data frame df:

id     sample1_1  sample1_2  sample2_1  sample2_2  sample2_3  sample3_1  sample3_2
honda  4.464274   7.087345   2.659297   83.513596  49.299961  22.991566  19.679316
audi   1.454645   2.784645   2.692656   14.010951  7.674361   3.84253    3.795233

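What I want to do is calculate, for each sample, each row's percentage of the honda + audi total. For honda in sample1_1, for example:

ratio = 4.464274 / (4.464274 + 1.454645) * 100

I want this for each row, with the resulting ratio columns bound to a new df: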

id     sample1_1  sample1_2  sample2_1  sample2_2  sample2_3  sample3_1  sample3_2  ratio_sample1_1...sample3_1
honda  4.464274   7.087345   2.659297   83.513596  49.299961  22.991566  19.679316
audi   1.454645   2.784645   2.692656   14.010951  7.674361   3.84253    3.795233

Is there an easy way to do this?

I also need the standard deviation of the ratios across sample replicates, something like this, but for each sample group:

sample1_1_ratio     sample1_2_ratio     STD
75  71  sd(sample1_1_ratio,sample1_2_ratio) 
24  28  sd(sample1_1_ratio,sample1_2_ratio)


Recommended answer

Here is a slightly different solution that gives the same ratios but organizes the data frame in a more manageable long format:

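library(dplyr)
library(tidyr)
df %>%
  # reshape to long format: one row per id/sample pair
  gather(sample, value, -id) %>%
  group_by(sample) %>%
  # each value as a percentage of its sample's total across ids
  mutate(ratio = value / sum(value) * 100)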
# A tibble: 14 x 4
# Groups:   sample [7]
       id    sample     value    ratio
   <fctr>     <chr>     <dbl>    <dbl>
 1  honda sample1_1  4.464274 75.42381
 2   audi sample1_1  1.454645 24.57619
 3  honda sample1_2  7.087345 71.79247
 4   audi sample1_2  2.784645 28.20753
 5  honda sample2_1  2.659297 49.68835
 6   audi sample2_1  2.692656 50.31165
 7  honda sample2_2 83.513596 85.63341
 8   audi sample2_2 14.010951 14.36659
 9  honda sample2_3 49.299961 86.53014
10   audi sample2_3  7.674361 13.46986
11  honda sample3_1 22.991566 85.68042
12   audi sample3_1  3.842530 14.31958
13  honda sample3_2 19.679316 83.83256
14   audi sample3_2  3.795233 16.16744
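
If the wide layout sketched in the question is preferred, the same ratios can be appended as extra columns. The following is a minimal sketch rather than part of the original answer: it assumes dplyr 1.0.0 or later (for across()), rebuilds df from the values in the question, and uses an illustrative result name, df_wide, and a ratio_ column prefix chosen to match the header in the question.

```r
library(dplyr)

# Data copied from the question
df <- data.frame(
  id        = c("honda", "audi"),
  sample1_1 = c(4.464274, 1.454645),
  sample1_2 = c(7.087345, 2.784645),
  sample2_1 = c(2.659297, 2.692656),
  sample2_2 = c(83.513596, 14.010951),
  sample2_3 = c(49.299961, 7.674361),
  sample3_1 = c(22.991566, 3.84253),
  sample3_2 = c(19.679316, 3.795233)
)

# For every sample column, add ratio_<column> = value / column total * 100.
# df_wide is an illustrative name, not one used in the original post.
df_wide <- df %>%
  mutate(across(starts_with("sample"),
                ~ .x / sum(.x) * 100,
                .names = "ratio_{.col}"))
```

The original sample columns are kept and seven ratio_ columns are added to the right, so the result has one row per id, as in the question's desired output.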

If you want the standard deviation of the ratios, you can compute it in the same pipe as follows. Because the data are still grouped by sample, sd(ratio) is taken over the ids within each sample, and mutate() repeats that value on every row of the group:

df %>%
  gather(sample, value, -id) %>%
  group_by(sample) %>%
  mutate(ratio = value / sum(value) * 100,
         sd_sample = sd(ratio))

If you don't want the value duplicated on every row of a group, you can run summarise(sdev = sd(ratio)) in a separate pipe instead.
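
A separate summarise() pipe could look like the sketch below. It is not the answerer's code. It assumes tidyr 1.0.0 or later and uses pivot_longer(), which is the current replacement for gather(). The names long, sd_by_sample and sd_by_group are illustrative. The last step is one possible reading of the question ("for each sample group"): it assumes that the part of the column name before the underscore identifies the group and that the standard deviation is wanted across replicates for each id.

```r
library(dplyr)
library(tidyr)

# Data copied from the question
df <- data.frame(
  id        = c("honda", "audi"),
  sample1_1 = c(4.464274, 1.454645),
  sample1_2 = c(7.087345, 2.784645),
  sample2_1 = c(2.659297, 2.692656),
  sample2_2 = c(83.513596, 14.010951),
  sample2_3 = c(49.299961, 7.674361),
  sample3_1 = c(22.991566, 3.84253),
  sample3_2 = c(19.679316, 3.795233)
)

# Long format with per-sample ratios, as in the answer
long <- df %>%
  pivot_longer(-id, names_to = "sample", values_to = "value") %>%
  group_by(sample) %>%
  mutate(ratio = value / sum(value) * 100) %>%
  ungroup()

# One row per sample: SD of the ratios across ids within that sample
sd_by_sample <- long %>%
  group_by(sample) %>%
  summarise(sdev = sd(ratio))

# Assumed reading of the question: SD across replicates,
# per id and per sample group (sample1, sample2, sample3)
sd_by_group <- long %>%
  separate(sample, into = c("group", "replicate"), sep = "_") %>%
  group_by(id, group) %>%
  summarise(sdev = sd(ratio), .groups = "drop")
```

sd_by_sample has one row per sample column, while sd_by_group has one row per id and sample group, which is closer to the STD column sketched in the question.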
