Extend a contingency table with proportions (percentages)

Question

I have a contingency table of counts, and I want to extend it with the corresponding proportion for each group.

Some sample data (the tips data set from the ggplot2 package):

library(ggplot2)

head(tips, 3)
#   total_bill tip    sex smoker day   time size
# 1         17 1.0 Female     No Sun Dinner    2
# 2         10 1.7   Male     No Sun Dinner    3
# 3         21 3.5   Male     No Sun Dinner    3

First, use table to count smokers vs. non-smokers, and nrow to count the total number of subjects:

table(tips$smoker)
#  No Yes 
# 151  93 

nrow(tips)
# [1] 244

Then I want to calculate the percentage of smokers vs. non-smokers. Something like this (ugly code):

# percentage of smokers
options(digits = 2)

transform(as.data.frame(table(tips$smoker)), percentage_column = Freq / nrow(tips) * 100)
#   Var1 Freq percentage_column
# 1   No  151                62
# 2  Yes   93                38

Is there a better way to do this?

(Even better would be to do this for a set of columns that I enumerate, e.g. smoker, day, and time, and to get reasonably well-formatted output.)

Answer

If it's conciseness you're after, you might like:

prop.table(table(tips$smoker))

then scale by 100 and round if you like. Or, for something closer to your exact output:

tbl <- table(tips$smoker)
cbind(tbl,prop.table(tbl))
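
The scaling and rounding step could look roughly like the sketch below. So that it runs without any package, it uses a small made-up factor named smoker as a stand-in for tips$smoker; the values are illustrative only and are not the real data. Naming the arguments to cbind is one way to get readable column headers.

```r
# Hypothetical stand-in for tips$smoker (made-up values, not the real data)
smoker <- factor(c("No", "No", "Yes", "No", "Yes", "No", "No", "Yes"))

tbl <- table(smoker)

# prop.table() gives proportions that sum to 1; scale to percentages and round
pct <- round(prop.table(tbl) * 100, 1)

# Named arguments become the column names of the resulting matrix
res <- cbind(Count = tbl, Percentage = pct)
res
```

The result is a numeric matrix with one row per level, so it can be passed to as.data.frame if a data frame is preferred.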

If you want to do this for multiple columns, there are many directions you could go, depending on what you consider clean-looking output. Here's one option:

tblFun <- function(x){
    tbl <- table(x)
    res <- cbind(tbl,round(prop.table(tbl)*100,2))
    colnames(res) <- c('Count','Percentage')
    res
}

do.call(rbind,lapply(tips[3:6],tblFun))
#        Count Percentage
# Female    87      35.66
# Male     157      64.34
# No       151      61.89
# Yes       93      38.11
# Fri       19       7.79
# Sat       87      35.66
# Sun       76      31.15
# Thur      62      25.41
# Dinner   176      72.13
# Lunch     68      27.87

If you don't like stacking the different tables on top of each other, you can ditch the do.call and leave them in a list.
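
As a minimal sketch of the list variant, the snippet below selects columns by name rather than by position, which may be closer to the "set of columns which I enumerate" in the question. The data frame tips_like and its values are made up for illustration, and tblFun is repeated so the snippet is self-contained.

```r
# Hypothetical miniature of the tips data (made-up rows, not the real data)
tips_like <- data.frame(
  smoker = factor(c("No", "No", "Yes", "No")),
  time   = factor(c("Dinner", "Lunch", "Dinner", "Dinner"))
)

# Same helper as in the answer: counts plus rounded percentages
tblFun <- function(x) {
  tbl <- table(x)
  res <- cbind(tbl, round(prop.table(tbl) * 100, 2))
  colnames(res) <- c("Count", "Percentage")
  res
}

# Enumerate the columns by name; without do.call(rbind, ...) the
# result stays a named list with one matrix per column
cols <- c("smoker", "time")
res_list <- lapply(tips_like[cols], tblFun)

res_list$smoker
```

Keeping the list means each variable's table can be printed or exported on its own, and the list names record which variable each table came from, which the stacked version loses.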
