Using tbl_summary to create summary statistics with labels

Views: 70 | Published: 2021/5/2 20:54:42 | Tags: r dplyr tidyverse gtsummary

Problem description

I have read a Stata (.dta) file into R, and a snippet of the data looks like this:

> short
# A tibble: 200 x 5
            q4_1          q4_2          q4_3          q4_4 treatment_cur
       <dbl+lbl>     <dbl+lbl>     <dbl+lbl>     <dbl+lbl> <chr>        
 1 NA(z)         NA(z)         NA(z)         NA(z)         Control      
 2 NA(z)         NA(z)         NA(z)         NA(z)         Control      
 3     1 [1.Yes]     0 [0.No]      0 [0.No]      1 [1.Yes] Treatment    
 4     0 [0.No]      0 [0.No]      1 [1.Yes]     0 [0.No]  Control      
 5     0 [0.No]      0 [0.No]      0 [0.No]      1 [1.Yes] Control      
 6 NA(z)         NA(z)         NA(z)         NA(z)         Control      
 7     1 [1.Yes]     1 [1.Yes]     1 [1.Yes]     1 [1.Yes] Control      
 8 NA(z)         NA(z)         NA(z)         NA(z)         Treatment    
 9 NA(z)         NA(z)         NA(z)         NA(z)         Control      
10     0 [0.No]      0 [0.No]      1 [1.Yes]     0 [0.No]  Control

The variables have the following format:

str(short)
tibble [200 x 5] (S3: tbl_df/tbl/data.frame)
 $ q4_1         : dbl+lbl [1:200] NA(z), NA(z),     1,     0,     0, NA(z),     1, NA(z), NA(z),     0, NA(z),     1, NA(z),     1, NA(z),     1, ...
   ..@ label       : chr "q4_1r.Do you have any of ...assignments? Bilingual/ELL"
   ..@ format.stata: chr "%15.0g"
   ..@ labels      : Named num [1:2] 0 1
   .. ..- attr(*, "names")= chr [1:2] "0.No" "1.Yes"
 $ q4_2         : dbl+lbl [1:200] NA(z), NA(z),     0,     0,     0, NA(z),     1, NA(z), NA(z),     0, NA(z),     0, NA(z),     0, NA(z),     0, ...
   ..@ label       : chr "q4_2r.Do you have any of ...assignments? Sp Ed (self-c)"
   ..@ format.stata: chr "%34.0g"
   ..@ labels      : Named num [1:2] 0 1
   .. ..- attr(*, "names")= chr [1:2] "0.No" "1.Yes"
 $ q4_3         : dbl+lbl [1:200] NA(z), NA(z),     0,     1,     0, NA(z),     1, NA(z), NA(z),     1, NA(z),     1, NA(z),     1, NA(z),     0, ...
   ..@ label       : chr "q4_3r.Do you have any of ...assignments? Sp Ed (incl.)"
   ..@ format.stata: chr "%72.0g"
   ..@ labels      : Named num [1:2] 0 1
   .. ..- attr(*, "names")= chr [1:2] "0.No" "1.Yes"
 $ q4_4         : dbl+lbl [1:200] NA(z), NA(z),     1,     0,     1, NA(z),     1, NA(z), NA(z),     0, NA(z),     1, NA(z),     0, NA(z),     0, ...
   ..@ label       : chr "q4_4r.Do you have any of ...assignments? Gifted/Talented"
   ..@ format.stata: chr "%17.0g"
   ..@ labels      : Named num [1:2] 0 1
   .. ..- attr(*, "names")= chr [1:2] "0.No" "1.Yes"
 $ treatment_cur: chr [1:200] "Control" "Control" "Treatment" "Control" ...
  ..- attr(*, "label")= chr "treatment_cur.treatment_cur"
  ..- attr(*, "format.stata")= chr "%9s"

This is the class of each variable:

> class(short$q4_1)
[1] "haven_labelled" "vctrs_vctr"     "double"
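
A vector of this class can be built by hand, which gives a small reproducible stand-in for data imported from Stata. The sketch below uses `haven::labelled()` and `haven::tagged_na()`; the name `short_demo` and all values in it are made up for illustration and are not the poster's data.

```r
library(haven)
library(tibble)

# Value labels in the same style as the Stata file ("0.No" / "1.Yes").
yes_no <- c("0.No" = 0, "1.Yes" = 1)

# Hypothetical miniature of `short`: one labelled column plus the group column.
# tagged_na("z") mimics the NA(z) entries shown in the printout above.
short_demo <- tibble(
  q4_1 = labelled(
    c(tagged_na("z"), 1, 0, 0, 1, tagged_na("z")),
    labels = yes_no,
    label  = "q4_1r.Do you have any of ...assignments? Bilingual/ELL"
  ),
  treatment_cur = c("Control", "Treatment", "Control",
                    "Control", "Control", "Treatment")
)

class(short_demo$q4_1)
```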

I need to create descriptive tabulations of the data using tbl_summary from library(gtsummary), which is a really cool package for creating quick and customizable summary statistics.

The nice thing about my data is that each value already has a label associated with it. For example, in q4_2, 0 is "No" and 1 is "Yes". So when I feed the data into tbl_summary, instead of this showing up in the frequency count:

q4_1    n
   1    7 
   0    8

this could show up, which is what I want:

"q4_1r.Do you have any of ...assignments? Bilingual/ELL"    
        n 
   No   7
   Yes  8

This code does not work, because tbl_summary only accepts certain classes:

tbl_summary(short)
Column(s) ‘q4_1’, ‘q4_2’, ‘q4_3’, and ‘q4_4’ omitted from output.
Accepted classes are ‘character’, ‘factor’, ‘numeric’, ‘logical’, ‘integer’, or ‘difftime’.

If I convert these variables to character, they lose their value labels, and I only see the following:

q4_1    n
   1    7 
   0    8

Are there any ideas for how I can work around this? I can't find a built-in R dataset that has this type of variable format to make this more reproducible.
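
One possible workaround, offered as a sketch and not as a confirmed answer, is to convert the labelled columns to factors before summarising: `factor` is among the classes the message above lists as accepted, and `haven::as_factor()` uses the value labels as factor levels. The data frame `short_demo` and its values are hypothetical; the `tbl_summary()` call only runs if gtsummary is installed.

```r
library(haven)
library(dplyr)

# Hypothetical stand-in for the poster's `short` data.
short_demo <- tibble::tibble(
  q4_1 = labelled(
    c(tagged_na("z"), 1, 0, 0, 1, tagged_na("z")),
    labels = c("0.No" = 0, "1.Yes" = 1),
    label  = "q4_1r.Do you have any of ...assignments? Bilingual/ELL"
  ),
  treatment_cur = c("Control", "Treatment", "Control",
                    "Control", "Control", "Treatment")
)

# Turn every haven_labelled column into a factor; other columns are untouched.
short_factor <- short_demo %>%
  mutate(across(where(is.labelled), as_factor))

levels(short_factor$q4_1)

# Factors are an accepted class, so the columns should no longer be omitted.
if (requireNamespace("gtsummary", quietly = TRUE)) {
  gtsummary::tbl_summary(short_factor)
}
```

The levels keep the numeric prefix from the Stata labels ("0.No", "1.Yes"); stripping that prefix would be a separate step.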
