Calculate the percentages of a column in a data frame - "grouped" by column
Problem description
I am an R beginner and have reached the point where I often need to calculate percentages of values in a data frame, "grouped" by another column's value.
I have a data frame with around 1000 rows, containing Mediatype, Version, Collection (= year) and Count (for that year). I can filter the rows to get only a specific media type:
trSpdf <- trS[trS$Mediatype == 'application/pdf',]
and get the following sample output:
> trSpdf
         Mediatype Version Collection Count
39 application/pdf      -1     co2008   2.0
40 application/pdf      -1     co2009   5.0
43 application/pdf       1     co2008   1.0
44 application/pdf       1     co2009   1.0
48 application/pdf     1.1     co2008  16.0
52 application/pdf     1.2     co2008  20.0
53 application/pdf     1.2     co2009  90.0
... (continuing) ...
What I want is to calculate the percentage of each version within each collection (= year), relative to all versions in that collection. For this example the result should be:
5.13% of all versions in co2008 were version -1 (2.0 / total sum for co2008)
2.56% of all versions in co2008 were version 1 (1.0 / total sum for co2008)
...
93.75% of all versions in co2009 were version 1.2 (90.0 / total sum for co2009)
...
Thanks in advance for any answers on how I could solve this.
Recommended answer
First, use ave to add a column giving the total count per Mediatype and Collection:
trS <- transform(trS, Tot.Count = ave(Count, Mediatype, Collection, FUN = sum))
Then it is straightforward to compute the percentage:
trS <- transform(trS, percentage = 100 * Count/Tot.Count)
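To see the two steps working together, here is a minimal sketch that rebuilds only the seven rows shown in the question and applies the same ave and transform calls. The data frame is typed in by hand as an assumption about the real data, so the totals cover these rows alone and would differ on the full data set.

```r
# Sample rows copied from the question's output (only the rows shown there).
trS <- data.frame(
  Mediatype  = "application/pdf",
  Version    = c("-1", "-1", "1", "1", "1.1", "1.2", "1.2"),
  Collection = c("co2008", "co2009", "co2008", "co2009",
                 "co2008", "co2008", "co2009"),
  Count      = c(2, 5, 1, 1, 16, 20, 90),
  stringsAsFactors = FALSE
)

# Group total: sum of Count within each Mediatype/Collection combination.
# ave() returns a vector as long as Count, so it fits as a new column.
trS <- transform(trS, Tot.Count = ave(Count, Mediatype, Collection, FUN = sum))

# Share of each version within its collection, in percent.
trS <- transform(trS, percentage = 100 * Count / Tot.Count)

print(trS)
```

With these rows the co2008 total is 39 and the co2009 total is 96, so version 1.2 in co2009 comes out at 90 / 96 = 93.75 percent.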
Or, if you want it nicely formatted (e.g. "5.13%"), use sprintf:
trS <- transform(trS, percentage = paste0(sprintf("%.2f", 100 * Count/Tot.Count),
"%"))
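As a small aside on the formatting step, the percent sign can also be produced by sprintf itself, since "%%" in a format string yields a literal "%". The sketch below uses two hand-picked values (an assumption for illustration, taken from the rows shown in the question) rather than the real data frame.

```r
# Illustrative percentages only: 2 of 39 and 90 of 96, as in the sample rows.
pct <- c(100 * 2 / 39, 100 * 90 / 96)

# "%.2f" rounds to two decimals; "%%" emits a literal percent sign.
formatted <- sprintf("%.2f%%", pct)

print(formatted)
```

Either way the result is a character vector, so it may be worth keeping the numeric percentage in a separate column if further calculations or sorting are needed.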