Displaying inference tree node values with "print"

Question

I apologize in advance if I butcher this question as I'm very new to R and statistical analysis in general.

I've generated a conditional inference tree using the party library.
When I plot(my_tree, type = "simple"), I get a plot whose terminal nodes show values like y = (0.96, 0.04).

When I print(my_tree) I get a result like this:

1) SOME_VALUE <= 2.5; criterion = 1, statistic = 1306.478
  2) SOME_VALUE <= -10.5; criterion = 1, statistic = 173.416
    3) SOME_VALUE <= -16; criterion = 1, statistic = 19.385
      4)*  weights = 275 
    3) SOME_VALUE > -16
      5)*  weights = 261 
  2) SOME_VALUE > -10.5
    6) SOME_VALUE <= -2.5; criterion = 1, statistic = 24.094
      7) SOME_VALUE <= -6.5; criterion = 0.974, statistic = 4.989
        8)*  weights = 346 
      7) SOME_VALUE > -6.5
        9)*  weights = 563 
    6) SOME_VALUE > -2.5
      10)*  weights = 442 
1) SOME_VALUE > 2.5
  11) SOME_VALUE <= 10; criterion = 1, statistic = 225.148
    12) SOME_VALUE <= 6.5; criterion = 1, statistic = 18.789
      13)*  weights = 648 
    12) SOME_VALUE > 6.5
      14)*  weights = 473 
  11) SOME_VALUE > 10
    15) SOME_VALUE <= 16; criterion = 1, statistic = 51.729
      16)*  weights = 595 
    15) SOME_VALUE > 16
      17) SOME_VALUE <= 23.5; criterion = 0.997, statistic = 8.931
        18)*  weights = 488 
      17) SOME_VALUE > 23.5
        19)*  weights = 365 

I prefer the output of print, but it seems to be lacking the y = (0.96, 0.04) values.

Ideally, I would like my output to look something like this:

1) SOME_VALUE <= 2.5; criterion = 1, statistic = 1306.478
  2) SOME_VALUE <= -10.5; criterion = 1, statistic = 173.416
    3) SOME_VALUE <= -16; criterion = 1, statistic = 19.385
      4)*  weights = 275; y = (0.96, 0.04)
    3) SOME_VALUE > -16
      5)*  weights = 261; y = (0.831, 0.169)
  2) SOME_VALUE > -10.5
...

How can I achieve this?

Recommended answer

It is possible to do this with the partykit package (the successor to party) but even there it requires some hacking. In principle, the print() function is customizable with panel functions for inner and terminal nodes etc. But they do not look very nice even for seemingly simple tasks like this one.

As you appear to have used a tree with a bivariate response, let's consider this simple (albeit not very meaningful) reproducible example:

library("partykit")
airq <- subset(airquality, !is.na(Ozone))
ct <- ctree(Ozone + Wind ~ ., data = airq)

For the inner nodes let's assume we just want to show the p-value that is readily available in the $info of each node. We can format this via:

ip <- function(node) formatinfo_node(node,
  prefix = " ",
  FUN = function(info) paste0("[p = ", format.pval(info$p.value), "]")
)
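As a small illustration of what the FUN argument above produces, the same formatting can be tried on a bare number. The helper name format_p is made up for this sketch and is not part of partykit; only base R's format.pval() and paste0() are used.

```r
# format_p is a hypothetical helper (not from partykit). It mirrors the
# anonymous FUN above, but takes a plain p-value instead of a node's info list.
format_p <- function(p) paste0("[p = ", format.pval(p), "]")

# Build the label for a single p-value and show it
label <- format_p(0.0044842)
print(label)
```

The resulting string has the same "[p = ...]" shape that formatinfo_node() appends to each inner node.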

For the terminal nodes we want to show the number of observations (assuming no weights have been used) and the mean response. Both are pre-computed in small tables and then accessed via the $id of each node:

n <- table(ct$fitted[["(fitted)"]])
m <- aggregate(ct$fitted[["(response)"]], list(ct$fitted[["(fitted)"]]), mean)
m <- apply(m[, -1], 1, function(x) paste(round(x, digits = 3), collapse = ", "))
names(m) <- names(n)
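To make it easier to see what these two lookup tables contain, here is a rough sketch of the same steps on a tiny hand-made data frame. The data frame and its column names (node, y1, y2) are invented for illustration and merely stand in for ct$fitted; no partykit objects are involved.

```r
# Hypothetical stand-in for ct$fitted: one terminal node id per observation
# plus a two-column response. All names and values are made up.
fitted <- data.frame(
  node = c(3, 3, 4, 4, 4, 5),
  y1   = c(10, 20, 40, 50, 60, 80),
  y2   = c(12, 10, 9, 10, 11, 7)
)

# Number of observations per terminal node, named by node id
n <- table(fitted$node)

# Mean of each response column within each node
m <- aggregate(fitted[, c("y1", "y2")], list(fitted$node), mean)

# Collapse each row of means into a single string such as "15, 11"
m <- apply(m[, -1], 1, function(x) paste(round(x, digits = 3), collapse = ", "))
names(m) <- names(n)

# Both tables can now be indexed by the node id as a character string
n[["4"]]
m[["4"]]
```

Indexing by as.character(id) rather than by position matters because terminal node ids are usually not consecutive.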

The panel function is then defined by:

tp <- function(node) formatinfo_node(node,
  prefix = ": ",
  FUN = function(info) paste0(
    "n = ", n[as.character(node$id)],
    ", y = (", m[as.character(node$id)], ")"
  )
)

To apply this in the print() method we need to call print.party() directly because currently print.constparty() does not pass this on correctly. (We will have to fix this in the partykit package.)

print.party(ct, inner_panel = ip, terminal_panel = tp)
## [1] root
## |   [2] Temp <= 82 [p = 0.0044842]
## |   |   [3] Temp <= 77: n = 52, y = (18.615, 11.562)
## |   |   [4] Temp > 77: n = 27, y = (41.815, 9.737)
## |   [5] Temp > 82: n = 37, y = (75.405, 7.565)

This is hopefully close to what you wanted to do and should give you a template for further modifications.
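The y = (0.96, 0.04) values in the question look like class proportions for a two-level factor response rather than means. Assuming that is the case, the terminal node labels could be built along the following lines. This is only a sketch on invented data: the vectors node and class are hypothetical stand-ins for the "(fitted)" and "(response)" columns of a fitted tree.

```r
# Hypothetical stand-ins: terminal node id and a binary class per observation
node  <- c(4, 4, 4, 4, 5, 5, 5, 5, 5)
class <- factor(c("no", "no", "no", "yes", "no", "no", "yes", "yes", "no"))

# Row-wise proportions: one row per node, one column per class level
p <- prop.table(table(node, class), margin = 1)

# One label per node, named by node id
lab <- apply(p, 1, function(x) paste(round(x, digits = 3), collapse = ", "))
print(lab)
```

A vector like lab could then take the place of m in the terminal panel function, again indexed with as.character(node$id).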
