Function for Tidy chisq.test Output for Visualizing or Filtering P-Values


Question

For the data...

library(productplots) 
library(ggmosaic)

For the code...

 library(tidyverse)
 library(broom)

I'm trying to create tidy chisq.test output so that I can easily filter or visualize p-values.

I'm using the "happy" dataset, which is included with either of the data packages listed above.

For this example, to compare the "happy" variable against all the other variables, I would isolate the categorical variables (I'm not going to create factor groupings out of age, year, etc. here) and then run a simple function.

df <- happy %>% select(-year, -age, -wtssall)
lapply(df, function(x) chisq.test(happy$happy, x))

However, I would like the p-values in a data frame that I can filter or visualize. I've tried various combinations similar to the code below, hoping to pipe the result into broom's "tidy" functions, into "filter" to narrow in on the significant p-values, or into a ggplot bar chart of p-values or chi-squared statistics.

df%>%summarise_if(is.factor,funs(chisq.test(.,df$happy)$p.value))

...but the output doesn't seem correct. If I run an individual chisq.test separately against each variable, the answers are different.

So, is there a way to easily compare categorical variables, in this case "happy" against all the other columns, and return a tidy data frame for further manipulation and analysis?

A purrr solution using dplyr::mutate, tidyr::nest, and purrr::map would be great, but I have a feeling the nested list-column method wouldn't work with chisq.test.

Recommended answer

You can do this all within the tidyverse workflow, using map in place of lapply. There's no need for nest unless you're going to subset the data to compare the results in some fashion (e.g., by age group).

df <- happy%>%
  select(-id, -year,-age,-wtssall) %>% 
  map(~chisq.test(.x, happy$happy)) %>% 
  tibble(names = names(.), data = .) %>% 
  mutate(stats = map(data, tidy))

unnest(df, stats)

# A tibble: 6 × 6
    names        data   statistic       p.value parameter                     method
    <chr>      <list>       <dbl>         <dbl>     <int>                     <fctr>
1   happy <S3: htest> 92606.00000  0.000000e+00         4 Pearson's Chi-squared test
2     sex <S3: htest>    11.46604  3.237288e-03         2 Pearson's Chi-squared test
3 marital <S3: htest>  2695.18474  0.000000e+00         8 Pearson's Chi-squared test
4  degree <S3: htest>   659.33013 4.057952e-137         8 Pearson's Chi-squared test
5 finrela <S3: htest>  2374.24165  0.000000e+00         8 Pearson's Chi-squared test
6  health <S3: htest>  2928.62829  0.000000e+00         6 Pearson's Chi-squared test
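
To get from this table to the filtering and plotting the question asks about, something like the following may work. It is only a sketch under a few assumptions: it uses the productplots copy of "happy" (the one with an id column), it drops the "happy" column itself so the variable is not tested against itself, and it uses a conventional 0.05 cutoff that the answer does not specify. The names results, significant, and p_plot are made up for illustration.

```r
library(productplots)  # assumed source of the `happy` data set
library(dplyr)
library(purrr)
library(broom)
library(ggplot2)

# One chi-squared test per categorical column, tidied into one row each.
# `.id = "variable"` stores the column name alongside the test results.
results <- happy %>%
  select(-id, -year, -age, -wtssall, -happy) %>%
  map(~ chisq.test(.x, happy$happy)) %>%
  map_dfr(tidy, .id = "variable")

# Keep only the tests below the (assumed) 0.05 significance cutoff.
significant <- results %>%
  filter(p.value < 0.05)

# Bar chart of the chi-squared statistics, ordered by size.
p_plot <- ggplot(significant,
                 aes(x = reorder(variable, statistic), y = statistic)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Chi-squared statistic")

print(p_plot)
```

Because results is an ordinary tibble with statistic and p.value columns, the same object can be passed to arrange, filter, or any other dplyr verb before plotting.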
