How to apply a function to a table to output P-values as a new row
Question
I have this simple data frame. The Sum column holds the total of each row. I would like to use prop.test to determine the p-value for each column and present those values as an additional row labeled p-value. I can use prop.test as shown below to get a p-value for any individual column, but cannot work out how to apply it to multiple columns with a single function.
      Other Island N_Shelf N_Shore S_Shore Sum
Type1    10      4       1       0       3  18
Type2    19     45       1       9      11  85
This outputs the p-value for the Island column:
ResI2 <- prop.test(x = TableAvE_Island$Island, n = TableAvE_Island$Sum)
Output:
data: TableAvE_Island$Island out of TableAvE_Island$Sum
X-squared = 4.456, df = 1, p-value = 0.03478
alternative hypothesis: two.sided
95 percent confidence interval:
-0.56027107 -0.05410802
sample estimates:
prop 1 prop 2
0.2222222 0.5294118
I've tried to use the apply command but cannot work out its usage, and the examples I've been able to find don't seem similar enough. Any pointers would be appreciated.
Recommended answer
Here's a look with broom's function tidy, which takes output from tests and other operations and formats it as a "tidy" data frame.
For the first prop.test that you posted, the tidy output looks like this:
library(tidyverse)
broom::tidy(prop.test(TableAvE_Island$Island, TableAvE_Island$Sum))
#> estimate1 estimate2 statistic p.value parameter conf.low
#> 1 0.2222222 0.5294118 4.456017 0.03477849 1 -0.5602711
#> conf.high
#> 1 -0.05410802
#> method
#> 1 2-sample test for equality of proportions with continuity correction
#> alternative
#> 1 two.sided
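As an aside, the same p-value can be read without broom, because prop.test returns an object of class "htest", which is a list with a p.value component. The sketch below assumes the data frame can be rebuilt from the table in the question; the data.frame() call is that reconstruction, not the asker's original code.

```r
# Reconstruction of the question's table (assumed from the printed values).
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85),
  row.names = c("Type1", "Type2")
)

# Two-sample test of equal proportions: Island counts out of the row totals.
res <- prop.test(x = TableAvE_Island$Island, n = TableAvE_Island$Sum)

# An "htest" object is a list; the p-value is one of its components.
# The question's own output reports this value as 0.03478.
island_p <- res$p.value
print(island_p)
```

tidy is still convenient when you want every component of the test in one row, but for a single number res$p.value is enough.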
To do this for all the variables in your data frame vs Sum, I gathered it into a long shape:
table_long <- gather(TableAvE_Island, key = variable, value = val, -Sum)
head(table_long)
#> # A tibble: 6 x 3
#> Sum variable val
#> <int> <chr> <int>
#> 1 18 Other 10
#> 2 85 Other 19
#> 3 18 Island 4
#> 4 85 Island 45
#> 5 18 N_Shelf 1
#> 6 85 N_Shelf 1
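In tidyr 1.0.0 and later, gather is superseded by pivot_longer. A rough equivalent of the reshaping step, assuming a recent tidyr is installed and using a data frame rebuilt from the question's table, might look like this:

```r
library(tidyr)

# Reconstruction of the question's table (assumed from the printed values).
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85)
)

# Every column except Sum becomes a (variable, val) pair; Sum is repeated
# alongside each pair so it can serve as n in prop.test later.
table_long <- pivot_longer(
  TableAvE_Island,
  cols      = -Sum,
  names_to  = "variable",
  values_to = "val"
)

print(table_long)
```

Note that pivot_longer orders the result row by row (all of Type1's columns, then all of Type2's) rather than column by column as gather does. That does not matter once the data are grouped by variable.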
Then I grouped the long-shaped data by variable and piped it into do, which lets you call a function on each of the groups in a data frame, using . as a stand-in for the subset of the data. Then I called tidy on the column containing the nested results of prop.test. This gives you a data frame of all the relevant results of the test, with one row for each of "Island", "N_Shelf", etc.
table_long %>%
  group_by(variable) %>%
  do(test = prop.test(x = .$val, n = .$Sum)) %>%
  broom::tidy(test)
#> # A tibble: 5 x 10
#> # Groups: variable [5]
#> variable estimate1 estimate2 statistic p.value parameter conf.low
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Island 0.222 0.529 4.46 0.0348 1 -0.560
#> 2 N_Shelf 0.0556 0.0118 0.0801 0.777 1 -0.0981
#> 3 N_Shore 0 0.106 0.972 0.324 1 -0.205
#> 4 Other 0.556 0.224 6.54 0.0106 1 0.0523
#> 5 S_Shore 0.167 0.129 0.00163 0.968 1 -0.183
#> # ... with 3 more variables: conf.high <dbl>, method <fct>,
#> # alternative <fct>
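The do plus tidy(test) pattern above reflects the package versions in use when this answer was written. With newer dplyr and broom releases it may not run as shown, since do is superseded and tidying a row-wise data frame is, as far as I know, no longer supported. One possible alternative, sketched here under the assumption that dplyr (0.8 or later), tidyr (1.0 or later) and broom are installed, is group_modify:

```r
library(dplyr)
library(tidyr)
library(broom)

# Reconstruction of the question's table (assumed from the printed values).
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85)
)

results <- TableAvE_Island %>%
  pivot_longer(cols = -Sum, names_to = "variable", values_to = "val") %>%
  group_by(variable) %>%
  # .x is the two-row subset (Type1, Type2) for the current variable;
  # tidy() turns the htest result into a one-row data frame.
  # prop.test may warn about the chi-squared approximation for small counts.
  group_modify(~ tidy(prop.test(x = .x$val, n = .x$Sum))) %>%
  ungroup()

print(select(results, variable, estimate1, estimate2, p.value))
```

The result should again have one row per variable, with the same columns that tidy produced in the output above.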
Created by the reprex package (v0.2.0).
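To get back to the layout the question asked for, with the p-values as an extra row of the original table, a base R sketch could loop over the count columns with sapply and append the result with rbind. The data frame is rebuilt from the question's table, and the names count_cols, p_row and with_p are made up for illustration.

```r
# Reconstruction of the question's table (assumed from the printed values).
TableAvE_Island <- data.frame(
  Other   = c(10, 19),
  Island  = c(4, 45),
  N_Shelf = c(1, 1),
  N_Shore = c(0, 9),
  S_Shore = c(3, 11),
  Sum     = c(18, 85),
  row.names = c("Type1", "Type2")
)

# Every column except Sum holds counts to be tested against the row totals.
count_cols <- setdiff(names(TableAvE_Island), "Sum")

# One prop.test per column; keep only the p-value.
# prop.test may warn about the chi-squared approximation for small counts.
p_values <- sapply(count_cols, function(col) {
  prop.test(x = TableAvE_Island[[col]], n = TableAvE_Island$Sum)$p.value
})

# Build a full-width row (NA under Sum), then append and label it.
p_row <- setNames(rep(NA_real_, ncol(TableAvE_Island)), names(TableAvE_Island))
p_row[count_cols] <- p_values

with_p <- rbind(TableAvE_Island, p_row)
rownames(with_p)[nrow(with_p)] <- "p-value"

print(round(with_p, 4))
```

Mixing counts and p-values in one data frame is fine for display, but for further analysis the long, tidy result is usually easier to work with.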