Calculating ratios by group with dplyr

Question

Using the following data frame, I would like to group the data by replicate and group, and then calculate the ratio of treatment (case) values to control values.

dataIn <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("case", "controls"), class = "factor"), treatment = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "EPA", class = "factor"), 
    replicate = structure(c(2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L), .Label = c("four", 
    "one", "three", "two"), class = "factor"), fatty_acid_family = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "saturated", class = "factor"), 
    fatty_acid = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "14:0", class = "factor"), 
    quant = c(6.16, 6.415, 4.02, 4.05, 4.62, 4.435, 3.755, 3.755
    )), .Names = c("group", "treatment", "replicate", "fatty_acid_family", 
"fatty_acid", "quant"), class = "data.frame", row.names = c(NA, 
-8L))

I have tried using dplyr as follows:

group_by(dataIn, replicate, group) %>% transmute(ratio = quant[group=="case"]/quant[group=="controls"])

But this results in an error: incompatible size (%d), expecting %d (the group size) or 1

Initially I thought this might be because I was trying to create 4 ratios from a data frame 8 rows deep, so I thought summarise might be the answer (collapsing each group to one ratio), but that doesn't work either (my understanding is lacking).

group_by(dataIn, replicate, group) %>% summarise(ratio = quant[group=="case"]/quant[group=="controls"])

  replicate    group ratio
1      four     case    NA
2      four controls    NA
3       one     case    NA
4       one controls    NA
5     three     case    NA
6     three controls    NA
7       two     case    NA
8       two controls    NA

I would appreciate some advice on where I'm going wrong, or on whether this can be done with dplyr at all.

Thanks.

Recommended answer

You could try:

group_by(dataIn, replicate) %>% 
    summarise(ratio = quant[group=="case"]/quant[group=="controls"])
#Source: local data frame [4 x 2]
#
#  replicate    ratio
#1      four 1.078562
#2       one 1.333333
#3     three 1.070573
#4       two 1.446449

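Because you grouped by both replicate and group, each group held a single row (either the case or the control value), so you could not access data from different groups at the same time. Grouping by replicate alone keeps each case/control pair in the same group.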
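To see the difference between the two groupings, here is a minimal sketch. It assumes a simplified copy of the question's data (only the `group`, `replicate`, and `quant` columns, stored as plain character vectors rather than factors); the names `sizes` and `ratios` are just illustrative.

```r
library(dplyr)

# Simplified copy of the question's data: only the columns needed for the ratio
dataIn <- data.frame(
  group     = rep(c("case", "controls"), each = 4),
  replicate = rep(c("one", "two", "three", "four"), times = 2),
  quant     = c(6.16, 6.415, 4.02, 4.05, 4.62, 4.435, 3.755, 3.755)
)

# Grouping by replicate AND group leaves one row per group,
# so no group contains both a case value and a control value
sizes <- dataIn %>% count(replicate, group)

# Grouping by replicate alone keeps each case/control pair together,
# so both subsets are non-empty inside summarise()
ratios <- dataIn %>%
  group_by(replicate) %>%
  summarise(ratio = quant[group == "case"] / quant[group == "controls"])
```

Every row of `sizes` should have `n` equal to 1, which is why the original attempt had nothing to divide by. Note that this approach relies on each replicate having exactly one case row and one control row; with more rows per replicate you would probably need to aggregate first (for example with `mean()`).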
