Calculate proportions within subsets of a data frame

Question

I am trying to obtain proportions within subsets of a data frame. For example, in this made-up data frame:

DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal    = rep(c("dog", "cat", "mouse"), 6),
                 number    = sample(18))

I would like to calculate the proportion of each of the three animals within each category1 by category2 combination (e.g., out of all animals that are both "A" and "X", what proportion are dogs?). Using prop.table on column 4 of the data frame, I can get the proportion that each row makes up of the total of the "number" column, but I have not found a way to do this for subsets defined by category1 and category2. I also tried splitting the data by category1 and category2 like this:

splitDF<-split(DF,list(DF$category1,DF$category2))

I was hoping I could then apply a function with prop.table to get the proportion of each animal within each split group, but I cannot get prop.table to work because I can't seem to specify which column to apply the function to within the split groups. Does anyone have any tips? Maybe this is possible with plyr or something similar? I couldn't find anything in the help forums about getting proportions within subsets of data.

Recommended answer

You can use the ddply() function from the plyr package to calculate the proportions within each category1 by category2 combination and add them to the data frame as a new column.

library(plyr)
# Within each category1/category2 group, add 'prop': the row's share of the group total
DF <- ddply(DF, .(category1, category2), transform, prop = number / sum(number))
DF
   category1 category2 animal number       prop
1          A         X    dog     17 0.44736842
2          A         X    cat      3 0.07894737
3          A         X  mouse     18 0.47368421
4          A         Y    dog      2 0.14285714
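
If loading plyr is not an option, the same proportions can probably be obtained in base R. The sketch below is not part of the original answer. It shows two routes: ave(), which applies a function within groups, and a completed version of the split() attempt from the question, where the column is named explicitly inside the function passed to lapply(). The set.seed() call, the prop_split column and the DF2 object are assumptions introduced here for illustration only.

```r
# Same made-up data as in the question. The seed is an assumption added
# here only so that the example is reproducible.
set.seed(1)
DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal    = rep(c("dog", "cat", "mouse"), 6),
                 number    = sample(18))

# Option 1: ave() applies prop.table() to 'number' within each
# category1 x category2 group and returns the values in the original row order.
DF$prop <- ave(DF$number, DF$category1, DF$category2, FUN = prop.table)

# Option 2: finish the split() attempt from the question by naming the
# column explicitly inside the function given to lapply().
splitDF <- split(DF, list(DF$category1, DF$category2))
splitDF <- lapply(splitDF, function(d) {
  d$prop_split <- prop.table(d$number)  # hypothetical column name
  d
})
DF2 <- do.call(rbind, splitDF)

# Sanity check: the proportions within every group should sum to 1.
aggregate(prop ~ category1 + category2, data = DF, FUN = sum)
```

Option 1 keeps the rows in their original order, whereas option 2 returns them grouped by split level with row names derived from the group labels, so ave() is likely the more convenient choice when the result has to line up with the original data frame.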
