Is it possible to get a p-value for nodes in a categorical tree analysis with R?


Question

Is it possible to get a p-value for each node in a classification tree analysis with R? I am using rpart and can't find a p-value for any node. Maybe this is only possible with a regression tree and not with a categorical response.

structure(list(subj = c(702L, 702L, 702L, 702L, 702L, 702L, 702L, 
702L, 702L, 702L, 702L, 702L, 702L, 702L, 702L, 702L, 702L, 702L, 
702L, 702L, 702L, 702L, 702L, 702L), visit = c(4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L), run = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 
4L), .Label = c("A", "B", "C", "D", "E", "xdur", "xend60", "xpre"
), class = "factor"), ho = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), hph = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), longexer = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("10min", "60min"), class = "factor"), 
    esq_sick = c(NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, 
    NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA), esq_sick2 = c(NA, 
    NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, 
    NA, NA, 0L, NA, NA, NA, NA, NA), ll_sick = c(NA, NA, 0L, 
    NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 
    0L, NA, NA, NA, NA, NA), ll_sick2 = c(NA, NA, 0L, NA, NA, 
    NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, 
    NA, NA, NA, NA), esq_01 = c(NA, NA, 2L, NA, NA, NA, NA, NA, 
    NA, NA, 2L, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, 
    NA), esq_02 = c(NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, 2L, 
    NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA), esq_03 = c(NA, 
    NA, 0L, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, 
    NA, NA, 0L, NA, NA, NA, NA, NA), esq_04 = c(NA, NA, 0L, NA, 
    NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, 
    NA, NA, NA, NA, NA), esq_05 = c(NA, NA, 0L, NA, NA, NA, NA, 
    NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, 
    NA, NA), esq_06 = c(NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, 
    1L, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA), 
    esq_07 = c(NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, 
    NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA), esq_08 = c(NA, 
    NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, 
    NA, NA, 0L, NA, NA, NA, NA, NA), esq_09 = c(NA, NA, 0L, NA, 
    NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, 
    NA, NA, NA, NA, NA), esq_10 = c(NA, NA, 0L, NA, NA, NA, NA, 
    NA, NA, NA, 0L, NA, NA, NA, NA, NA, NA, NA, 0L, NA, NA, NA, 
    NA, NA)), .Names = c("subj", "visit", "run", "ho", "hph", 
"longexer", "esq_sick", "esq_sick2", "ll_sick", "ll_sick2", "esq_01", 
"esq_02", "esq_03", "esq_04", "esq_05", "esq_06", "esq_07", "esq_08", 
"esq_09", "esq_10"), row.names = 7:30, class = "data.frame")



# Read the data
alldata <- read.table("symptomology CSV2.csv", header = TRUE, sep = ",")

library(rpart)

# Classification tree for esq_sick2 on the 20 symptom items
fit <- rpart(esq_sick2 ~ esq_01_bin + esq_02_bin + esq_03_bin + esq_04_bin +
               esq_05_bin + esq_06_bin + esq_07_bin + esq_08_bin +
               esq_09_bin + esq_10_bin + esq_11_bin + esq_12_bin +
               esq_13_bin + esq_14_bin + esq_15_bin + esq_16_bin +
               esq_17_bin + esq_18_bin + esq_19_bin + esq_20_bin,
             method = "class", data = alldata)

plot(fit, uniform = FALSE, branch = 1, compress = FALSE, margin = 0.1, minbranch = 0.3)
text(fit, use.n = TRUE, all = TRUE, cex = 0.8)

Recommended answer

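Here's an example that might help you. rpart chooses splits by impurity reduction and does not report p-values, whereas ctree from the partykit package builds conditional inference trees, which select each split with a permutation test and therefore have a p-value at each tested node. I'm using the built-in airquality data set and the example provided in the help for ctree: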

library(partykit)
# For the sctest function to extract p-values (see help for ctree and sctest)
library(strucchange)

# Data we'll use
airq <- subset(airquality, !is.na(Ozone))

# Build the tree
airct <- ctree(Ozone ~ ., data = airq)

Look at the tree:

airct
Model formula:
Ozone ~ Solar.R + Wind + Temp + Month + Day

Fitted party:
[1] root
|   [2] Temp <= 82
|   |   [3] Wind <= 6.9: 55.600 (n = 10, err = 21946.4)
|   |   [4] Wind > 6.9
|   |   |   [5] Temp <= 77: 18.479 (n = 48, err = 3956.0)
|   |   |   [6] Temp > 77: 31.143 (n = 21, err = 4620.6)
|   [7] Temp > 82
|   |   [8] Wind <= 10.3: 81.633 (n = 30, err = 15119.0)
|   |   [9] Wind > 10.3: 48.714 (n = 7, err = 1183.4)

Extract the p-values:

sctest(airct)

$`1`
              Solar.R         Wind         Temp     Month        Day
statistic 13.34761286 4.161370e+01 5.608632e+01 3.1126596 0.02011554
p.value    0.00129309 5.560572e-10 3.468337e-13 0.3325881 0.99998175

$`2`
            Solar.R         Wind         Temp     Month      Day
statistic 5.4095322 12.968549828 11.298951405 0.2148961 2.970294
p.value   0.0962041  0.001582833  0.003871534 0.9941976 0.357956

$`3`
NULL

$`4`
              Solar.R     Wind         Temp      Month       Day
statistic 9.547191843 2.307676 11.598966936 0.06604893 0.2513143
p.value   0.009972755 0.497949  0.003295072 0.99965679 0.9916670

$`5`
             Solar.R      Wind      Temp     Month       Day
statistic 6.14094026 1.3865355 1.9986304 0.8268341 1.3580462
p.value   0.06432172 0.7447599 0.5753799 0.8952749 0.7528481

$`6`
            Solar.R       Wind      Temp    Month       Day
statistic 5.1824354 0.02060939 0.9270013 0.165171 4.6220522
p.value   0.1089932 0.99998062 0.8705785 0.996871 0.1481643

$`7`
            Solar.R         Wind       Temp     Month        Day
statistic 0.8083249 11.711564549 6.77148538 0.1307643 0.03992875
p.value   0.8996614  0.003101788 0.04546281 0.9982052 0.99990034

$`8`
            Solar.R      Wind      Temp       Month         Day
statistic 0.9056479 3.1585094 2.9285252 0.008106707 0.008686293
p.value   0.8759687 0.3247585 0.3657072 0.999998099 0.999997742

$`9`
NULL
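
In the output above, each non-NULL entry is a matrix with one column per candidate variable, and NULL marks a node where no tests were run. As a rough sketch of how this could be condensed to one row per tested node, the block below picks the variable with the smallest p-value in each matrix. The helper name node_pvalues is hypothetical (it is not part of partykit), and the iris tree is only an assumed stand-in to show the same call on a factor response, as in the question; it is not the asker's data.

```r
library(partykit)
library(strucchange)  # for the sctest() generic, as in the answer above

# Hypothetical helper (not part of partykit): for every node where tests
# were run, return the variable with the smallest p-value and that p-value.
node_pvalues <- function(tree) {
  tests <- sctest(tree)
  tested <- Filter(Negate(is.null), tests)  # drop nodes without tests
  data.frame(
    node = as.integer(names(tested)),
    variable = vapply(
      tested,
      function(m) colnames(m)[which.min(m["p.value", ])],
      character(1)
    ),
    p_value = vapply(
      tested,
      function(m) min(m["p.value", ], na.rm = TRUE),
      numeric(1)
    ),
    row.names = NULL
  )
}

# Regression tree from the answer
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq)
air_p <- node_pvalues(airct)
print(air_p)

# Classification tree: the response (Species) is a factor
irisct <- ctree(Species ~ ., data = iris)
iris_p <- node_pvalues(irisct)
print(iris_p)
```

Note that tested terminal nodes also appear in the result: there the smallest p-value was simply not small enough for ctree to split further, so a row in this table does not by itself mean the node was split.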
