partykit: justify text in terminal node when unequal regressors' name lengths are included

Views: 44 Published: 2021/4/29 20:38:30 Tags: r tree decision-tree party


Problem description

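I am trying to edit the aesthetics of the terminal nodes to:

Increase the size of the box so that the full names are listed inside it.
If possible, justify the inner text when the regressors' names have unequal lengths, so that the terminal node gives a table-like view.

Below I list my attempt using the gp options (fontsize = 10, boxwidth = 10), but I suspect I am using the wrong aesthetic options.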

The mysummary function is taken from the Stack Overflow answer cited in the code comment below.



library("partykit")

set.seed(1234L)
data("PimaIndiansDiabetes", package = "mlbench")
## a simple basic fitting function (of type 1) for a logistic regression
logit <- function(y, x, start = NULL, weights = NULL, offset = NULL, ...) {
  glm(y ~ 0 + x, family = binomial, start = start, ...)
}


## Long name regressors
PimaIndiansDiabetes$looooong_name_1 <- rnorm(nrow(PimaIndiansDiabetes))
PimaIndiansDiabetes$looooong_name_2 <- rnorm(nrow(PimaIndiansDiabetes))
## Short name regressor
PimaIndiansDiabetes$short_name <- rnorm(nrow(PimaIndiansDiabetes))


## set up a logistic regression tree
pid_tree <- mob(diabetes ~ glucose        + 
                          looooong_name_1 +
                          looooong_name_2 +
                          short_name      | 
                          pregnant + pressure + triceps + insulin +
                          mass + pedigree + age, data = PimaIndiansDiabetes, fit = logit)

## Summary function from: https://stackoverflow.com/questions/65495322/partykit-modify-terminal-node-to-include-standard-deviation-and-significance-of/65500344#65500344
mysummary <- function(info, digits = 2) {
  n <- info$nobs
  na <- format(names(coef(info$object)))
  cf <- format(coef(info$object), digits = digits)
  se <- format(sqrt(diag(vcov(info$object))), digits = digits)
  t <- format(coef(info$object) / sqrt(diag(vcov(info$object))), digits = digits)

  c(paste("n =", n),
    paste("Regressor","beta" ,"[", "t-ratio" ,"]"),
    paste(na, cf, "[",t,"]")
  )
}

#plot tree
plot(pid_tree,
     terminal_panel = node_terminal,
     tp_args = list(FUN = mysummary,fill = c("white")),
     gp = gpar(fontsize = 10,
               boxwidth = 10,           ## apparently this option doesn't belong here,
               margins = rep(0.01, 4))) ## and neither does this one.

This is what I am getting:

but I would like to get something like the following:

Thanks a lot.

Solution

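A simple and basic solution is to use a monospaced (fixed-width) font like Courier or Inconsolata. The names are already padded to a common length by format() in mysummary, so the columns line up as soon as every character takes the same horizontal space: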

plot(pid_tree, terminal_panel = node_terminal,
  tp_args = list(FUN = mysummary, fill = "white"),
  gp = gpar(fontfamily = "inconsolata"))
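The header row in mysummary is not padded, so it may still sit slightly off from the columns below it. The following is a minimal sketch of one way to pad the header together with the rows. The helper name `align_rows` and the numbers fed to it are hypothetical and only serve as an illustration; they are not part of partykit or of the original answer.

```r
## Sketch: build equal-width rows for a terminal-node table.
## `align_rows` and the example values are hypothetical, for illustration only.
align_rows <- function(regressor, beta, tratio, digits = 2) {
  ## put the header on top of each column, then pad the column to a common width
  na <- format(c("Regressor", regressor))                       # left-justified
  cf <- format(c("beta", format(beta, digits = digits)), justify = "right")
  tr <- format(c("t-ratio", format(tratio, digits = digits)), justify = "right")
  paste0(na, "  ", cf, "  [", tr, "]")
}

rows <- align_rows(
  regressor = c("xglucose", "xlooooong_name_1", "xshort_name"),
  beta      = c(0.0371, -0.0212, 0.1534),
  tratio    = c(8.91, -0.24, 1.67)
)
writeLines(rows)

## Every row has the same number of characters, so the columns line up
## whenever the font gives each character the same width.
stopifnot(length(unique(nchar(rows))) == 1)
```

Inside mysummary, the same padding could be applied to names(coef(info$object)), the coefficients and the t-ratios. If Inconsolata is not available on the graphics device, gpar(fontfamily = "mono") selects R's generic monospaced family instead.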

In addition to this simple text-based table, you can also produce more elaborate tables, e.g., via ggplot2 and gtable as in the following plot taken from: Seibold, Hothorn, Zeileis (2019). "Generalised Linear Model Trees with Global Additive Effects." Advances in Data Analysis and Classification, 13, 703-725. doi:10.1007/s11634-018-0342-1

The code is a little bit involved but available in the replication materials of the article. Specifically, you need these two files:
