How to build a two-way table summarizing a third variable in R (kable package)

Question

I am working with RMarkdown and trying to use the kable package. I have a three-variable data frame: gender (factor), age_group (factor), and test_score (scale). I want to create two-way tables with the factor variables (gender and age_group) as table rows and columns, and summary statistics of test_score as cell contents. These summary statistics are the mean, standard deviation, and percentiles (median, 1st decile, 9th decile, and 99th percentile). Is there an easy way to build such tables in a nicely formatted way (as with the kable package), without first entering all those values into a matrix by hand? I searched the kable help file but could not find how to do it.

# What my data looks like:

gender <- rep(c(rep(c("M", "F"), each=3)), times=3)
age <- as.factor(rep(seq(10,12, 1), each=6))
score <- c(4,6,8,4,8,9,6,6,9,7,10,13,8,9,13,12,14,16)
testdata <-data.frame(gender,age,score)


| gender | age | score |
|--------|-----|-------|
| M      | 10  | 4     |
| M      | 10  | 6     |
| M      | 10  | 8     |
| F      | 10  | 4     |
| F      | 10  | 8     |
| F      | 10  | 9     |
| M      | 11  | 6     |
| M      | 11  | 6     |
| M      | 11  | 9     |
| F      | 11  | 7     |
| F      | 11  | 10    |
| F      | 11  | 13    |
| M      | 12  | 8     |
| M      | 12  | 9     |
| M      | 12  | 13    |
| F      | 12  | 12    |
| F      | 12  | 14    |
| F      | 12  | 16    |

I would like a table that looks like the one below (but calculated directly from my dataset and in a polished, publication-ready format):

      Mean score by gender & age
|        | 10yo | 11yo | 12yo | Total |
|--------|:----:|:----:|:----:|:-----:|
| Male   |   6  |   7  |  10  |  7.7  |
| Female |   7  |  10  |  14  |  10.3 |
| Total  |  6.5 |  8.5 |  12  |   9   |

I tried the kable package, which did give me some nicely formatted tables, but I can only produce frequency tables with it. I cannot find any argument that lets me summarize a variable within the cells. If anyone can suggest a better package for building a table like the one specified above, I would appreciate it a lot.

library(knitr)
library(kableExtra)
kable(testdata, "latex", booktabs = TRUE) %>%
   kable_styling(latex_options = "striped")

Recommended answer

Absent a reproducible example, multi-way tables including a variety of statistics can be created with the tables::tabular() function.

Here is an example from the tables documentation (page 38) that illustrates multiple variables in a table that prints means and standard deviations.

set.seed(1206)

q <- data.frame(p = rep(c("A","B"),each = 10,len = 30), 
                a = rep(c(1,2,3),each = 10),
                id = seq(30),
                b = round(runif(30,10,20)),
                c = round(runif(30,40,70)))
library(tables)
tab <- tabular((Factor(p)*Factor(a)+1) ~ (N = 1) + (b + c) * (mean + sd),
               data = q)
tab[ tab[,1] > 0, ]

A Stackoverflow friendly version of the output is:

          b           c          
 p a   N  mean  sd    mean  sd   
 A 1   10 14.40 3.026 55.70 6.447
   3   10 14.50 2.877 52.80 8.954
 B 2   10 14.40 2.836 56.30 7.889
   All 30 14.43 2.812 54.93 7.714

One can render the table to HTML with the html() function. The output from the following code, when rendered in an HTML browser looks like the following illustration.

html(tab[ tab[,1] > 0, ])
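
The same formula interface can presumably be pointed at the data from the question. The following is a minimal sketch, assuming the tables package is installed and that the data frame is named testdata as in the question; the object name tab_mean is introduced here only for illustration. The `+ 1` terms ask tabular() for an "All" row and an "All" column, which play the role of the "Total" margins in the desired table.

```r
library(tables)

# Data from the question
testdata <- data.frame(
  gender = rep(rep(c("M", "F"), each = 3), times = 3),
  age    = factor(rep(10:12, each = 6)),
  score  = c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
)

# Rows: gender plus an "All" row; columns: age plus an "All" column.
# Each cell holds the mean of score for that subset of the raw data.
tab_mean <- tabular((Factor(gender) + 1) ~ (Factor(age) + 1) * score * mean,
                    data = testdata)
tab_mean
```

Replacing `mean` with `(mean + sd)` in the formula should add a standard-deviation column next to each mean, in the same way as the documentation example above.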

tables includes capabilities to calculate other statistics, including quantiles. For details on quantile calculations, see pp. 29 - 30 of the tables package manual.
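
For readers who want to check the numbers without the tables package, the same kind of two-way summary, quantiles included, can be sketched in base R with tapply(). This is a swapped-in technique, not the method from the tables manual. The helper name two_way_summary and its arguments are hypothetical, introduced only for illustration; the data is the testdata frame from the question.

```r
# Data from the question
testdata <- data.frame(
  gender = rep(rep(c("M", "F"), each = 3), times = 3),
  age    = factor(rep(10:12, each = 6)),
  score  = c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
)

# Hypothetical helper: apply `stat` to `value` within each row x col cell
# and to each margin, always using the raw observations
# (so the totals are not averages of averages).
two_way_summary <- function(data, row, col, value, stat = mean) {
  v <- data[[value]]
  cells   <- tapply(v, list(data[[row]], data[[col]]), stat)
  row_tot <- as.vector(tapply(v, data[[row]], stat))
  col_tot <- as.vector(tapply(v, data[[col]], stat))
  out <- cbind(cells, Total = row_tot)
  rbind(out, Total = c(col_tot, stat(v)))
}

mean_tab <- two_way_summary(testdata, "gender", "age", "score")
sd_tab   <- two_way_summary(testdata, "gender", "age", "score", stat = sd)
p90_tab  <- two_way_summary(testdata, "gender", "age", "score",
                            stat = function(x) unname(quantile(x, 0.9)))
round(mean_tab, 1)
```

Any function that returns a single number can be passed as `stat`, so the median, the 1st decile, or the 99th percentile follow the same pattern as the 9th decile shown here. Note that quantile() uses its default interpolation method unless a `type` argument is supplied.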

The package also works with knitr, kable, and kableExtra.
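
As a rough illustration of the kable side, the sketch below builds the matrix of means with base R tapply() and hands it to knitr::kable(). This is an assumed, simplified route rather than the tables-to-kable workflow the answer alludes to; the row and column labels and the object names cells, mean_tab, and out are chosen here for illustration.

```r
library(knitr)

# Data from the question
testdata <- data.frame(
  gender = rep(rep(c("M", "F"), each = 3), times = 3),
  age    = factor(rep(10:12, each = 6)),
  score  = c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
)

# Cell means, then margins computed from the raw scores
cells <- with(testdata, tapply(score, list(gender, age), mean))
mean_tab <- rbind(
  cbind(cells, Total = as.vector(with(testdata, tapply(score, gender, mean)))),
  Total = c(as.vector(with(testdata, tapply(score, age, mean))),
            mean(testdata$score))
)

# tapply() orders the gender levels alphabetically: "F" first, then "M"
rownames(mean_tab) <- c("Female", "Male", "Total")
colnames(mean_tab) <- c("10yo", "11yo", "12yo", "Total")

out <- kable(mean_tab, format = "markdown", digits = 1,
             caption = "Mean score by gender & age")
out
```

In an RMarkdown document the `format` argument can usually be left out so that kable() picks the output format of the document being knitted.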
