Extend contingency table with proportions (percentages)
Question
I have a contingency table of counts, and I want to extend it with the corresponding proportions for each group.
Some sample data (the tips data set from the ggplot2 package):
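library(ggplot2)
# Note: in current package versions the tips data ships with reshape2, not ggplot2;
# if tips is not found, load it with data(tips, package = "reshape2")
head(tips, 3)
#   total_bill tip    sex smoker day   time size
# 1         17 1.0 Female     No Sun Dinner    2
# 2         10 1.7   Male     No Sun Dinner    3
# 3         21 3.5   Male     No Sun Dinner    3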
First, use table to count smokers vs. non-smokers, and nrow to count the total number of subjects:
table(tips$smoker)
#  No Yes
# 151  93
nrow(tips)
# [1] 244
Then, I want to calculate the percentage of smokers vs. non-smokers. Something like this (ugly code):
# percentage of smokers
options(digits = 2)
transform(as.data.frame(table(tips$smoker)), percentage_column = Freq / nrow(tips) * 100)
#   Var1 Freq percentage_column
# 1   No  151                62
# 2  Yes   93                38
Is there a better way to do this?
(Even better would be to do this on a set of columns that I enumerate, e.g., smoker, day, and time, and have the output somewhat nicely formatted.)
Recommended answer
If it's conciseness you're after, you might like:
prop.table(table(tips$smoker))
and then scale by 100 and round if you like. Or, to get something closer to your exact output:
tbl <- table(tips$smoker)
cbind(tbl,prop.table(tbl))
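As a rough sketch of the scale-and-round step, here is the same idea applied to the built-in mtcars data, chosen only so the example runs without extra packages. The column am and the names tbl, pct, and res are illustrative assumptions, not part of the original answer.

```r
# Counts of automatic (0) vs. manual (1) transmissions in the built-in mtcars data
# (mtcars stands in for tips here; this is an assumption made for portability)
tbl <- table(mtcars$am)

# Proportions scaled to percentages and rounded to one decimal place
pct <- round(prop.table(tbl) * 100, 1)

# Bind counts and percentages side by side; the argument names become column names
res <- cbind(Count = tbl, Percentage = pct)
res
```

Naming the arguments to cbind avoids a separate colnames call; the row names come from the levels of the tabulated variable.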
If you want to do this for multiple columns, there are lots of different directions you could go, depending on what you consider clean-looking output, but here's one option:
tblFun <- function(x){
    tbl <- table(x)
    res <- cbind(tbl,round(prop.table(tbl)*100,2))
    colnames(res) <- c('Count','Percentage')
    res
}
do.call(rbind,lapply(tips[3:6],tblFun))
#        Count Percentage
# Female    87      35.66
# Male     157      64.34
# No       151      61.89
# Yes       93      38.11
# Fri       19       7.79
# Sat       87      35.66
# Sun       76      31.15
# Thur      62      25.41
# Dinner   176      72.13
# Lunch     68      27.87
If you don't like stacking the different tables on top of each other, you can ditch the do.call and leave them in a list.
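A minimal sketch of the list variant, again on the built-in mtcars data so it runs without the tips data set. The column choice (cyl, gear, am) and the names cols, tbl_list, and stacked are assumptions made for illustration. The second half is an optional extension, not something the answer proposes: it stacks the tables into a data frame that records which variable each row came from.

```r
# Same helper as in the answer: counts plus rounded percentages for one vector
tblFun <- function(x) {
  tbl <- table(x)
  res <- cbind(tbl, round(prop.table(tbl) * 100, 2))
  colnames(res) <- c("Count", "Percentage")
  res
}

# Illustrative columns from the built-in mtcars data (not the tips data)
cols <- c("cyl", "gear", "am")

# Keep one table per variable in a named list instead of stacking them
tbl_list <- lapply(mtcars[cols], tblFun)
tbl_list$cyl

# Optional: stack into a data frame that keeps the source variable of each row
stacked <- do.call(rbind, lapply(names(tbl_list), function(v) {
  m <- tbl_list[[v]]
  data.frame(Variable = v,
             Level = rownames(m),
             Count = unname(m[, "Count"]),
             Percentage = unname(m[, "Percentage"]))
}))
stacked
```

Keeping the list makes it easy to pull out a single table by name, while the labeled data frame avoids the ambiguity of stacked rows whose levels happen to share a name across variables.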