How to apply a function to a table to output P-values as a new row

Question

I have this simple data frame. The Sum column holds the row totals. I would like to use prop.test to get a p-value for each column and present those values as an additional row labeled p-value. I can use prop.test as shown below to get a p-value for any individual column, but I cannot work out how to apply it to multiple columns with a single function.

      Other Island N_Shelf N_Shore S_Shore Sum
Type1    10      4       1       0       3  18
Type2    19     45       1       9      11  85

This outputs the p-value for the Island column:

ResI2<- prop.test(x=TableAvE_Island$Island, n=TableAvE_Island$Sum)

Output:

data:  TableAvE_Island$Island out of TableAvE_Island$Sum
X-squared = 4.456, df = 1, p-value = 0.03478
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.56027107 -0.05410802
sample estimates:
   prop 1    prop 2 
0.2222222 0.5294118 

I've tried to use the apply command but cannot work out its usage, and the examples I've been able to find don't seem similar enough. Any pointers would be appreciated.

Recommended answer

Here's a look with broom's function tidy, which takes output from tests and other operations and formats them as "tidy" data frames.

For the first prop.test that you posted, the tidy output looks like this:

library(tidyverse)

broom::tidy(prop.test(TableAvE_Island$Island, TableAvE_Island$Sum))
#>   estimate1 estimate2 statistic    p.value parameter   conf.low
#> 1 0.2222222 0.5294118  4.456017 0.03477849         1 -0.5602711
#>     conf.high
#> 1 -0.05410802
#>                                                                 method
#> 1 2-sample test for equality of proportions with continuity correction
#>   alternative
#> 1   two.sided

To do this for all the variables in your data frame vs Sum, I gathered it into a long shape, keeping Sum as its own column:

table_long <- gather(TableAvE_Island, key = variable, value = val, -Sum)
head(table_long)
#> # A tibble: 6 x 3
#>     Sum variable   val
#>   <int> <chr>    <int>
#> 1    18 Other       10
#> 2    85 Other       19
#> 3    18 Island       4
#> 4    85 Island      45
#> 5    18 N_Shelf      1
#> 6    85 N_Shelf      1
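
As a side note, here is a sketch of the same reshaping step with pivot_longer, assuming tidyr 1.0.0 or later is installed (gather still works there but is marked as superseded). The data frame is rebuilt here from the table in the question so the snippet runs on its own; the column names variable and val simply mirror the ones used above.

```r
library(tidyr)

# Rebuild the example table from the question
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85),
  row.names = c("Type1", "Type2")
)

# Reshape every column except Sum into variable/val pairs
table_long <- pivot_longer(
  TableAvE_Island,
  cols      = -Sum,
  names_to  = "variable",
  values_to = "val"
)
```

The rows come out in a different order than with gather (row by row rather than column by column), but within each variable the Type1 value still precedes the Type2 value, so the grouped prop.test call is unaffected.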

I then grouped the long-shaped data by variable and piped it into do, which lets you call a function on each group in a data frame, using . as a stand-in for that group's subset of the data. Finally, I called tidy on the column containing the nested prop.test results. This gives you a data frame of all the relevant test results, with one row for each of "Island", "N_Shelf", etc.

table_long %>%
    group_by(variable) %>%
    do(test = prop.test(x = .$val, n = .$Sum)) %>%
    broom::tidy(test)
#> # A tibble: 5 x 10
#> # Groups:   variable [5]
#>   variable estimate1 estimate2 statistic p.value parameter conf.low
#>   <chr>        <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>
#> 1 Island      0.222     0.529    4.46     0.0348         1  -0.560 
#> 2 N_Shelf     0.0556    0.0118   0.0801   0.777          1  -0.0981
#> 3 N_Shore     0         0.106    0.972    0.324          1  -0.205 
#> 4 Other       0.556     0.224    6.54     0.0106         1   0.0523
#> 5 S_Shore     0.167     0.129    0.00163  0.968          1  -0.183 
#> # ... with 3 more variables: conf.high <dbl>, method <fct>,
#> #   alternative <fct>
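
If the goal is literally the extra row described in the question, one possible sketch in base R is below. It is not part of the original answer: it assumes the data frame has the columns shown in the question, and the helper names count_cols, p_values, p_row and with_p are made up for illustration. It runs prop.test once per count column with sapply, keeps only the p.value element, and appends the result with rbind.

```r
# Rebuild the example table from the question
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85),
  row.names = c("Type1", "Type2")
)

# Every column except Sum is tested against Sum
count_cols <- setdiff(names(TableAvE_Island), "Sum")

# One prop.test per column, keeping only the p-value.
# Small counts trigger a "Chi-squared approximation may be incorrect"
# warning, which is silenced here to keep the output readable.
p_values <- sapply(count_cols, function(col) {
  suppressWarnings(
    prop.test(x = TableAvE_Island[[col]], n = TableAvE_Island$Sum)$p.value
  )
})

# Build a one-row data frame (Sum has no p-value) and append it
p_row  <- as.data.frame(as.list(c(p_values, Sum = NA)), row.names = "p-value")
with_p <- rbind(TableAvE_Island, p_row)
with_p
```

Because each column now mixes counts and p-values, this layout is mainly useful for display; for further analysis the tidy data frame above is easier to work with.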

Created with the reprex package (v0.2.0).
