partykit: justify text in terminal node when unequal regressors' name lengths are included
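Problem description
I am trying to edit the aesthetics of the terminal node to:
- Enlarge the box so that the full names fit inside it.
- If possible, justify the inner text when the regressors' names have unequal lengths, so that the terminal node reads like a table.
Below I list my attempt, using the gp options (fontsize = 10, boxwidth = 10), but I suspect I am using the wrong aesthetics options.
The mysummary function comes from an earlier Stack Overflow answer; its URL is given in the code comments.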
library("partykit")
set.seed(1234L)
data("PimaIndiansDiabetes", package = "mlbench")

## a simple basic fitting function (of type 1) for a logistic regression
logit <- function(y, x, start = NULL, weights = NULL, offset = NULL, ...) {
  glm(y ~ 0 + x, family = binomial, start = start, ...)
}

## Long name regressors
PimaIndiansDiabetes$looooong_name_1 <- rnorm(nrow(PimaIndiansDiabetes))
PimaIndiansDiabetes$looooong_name_2 <- rnorm(nrow(PimaIndiansDiabetes))
## Short name regressor
PimaIndiansDiabetes$short_name <- rnorm(nrow(PimaIndiansDiabetes))

## set up a logistic regression tree
pid_tree <- mob(diabetes ~ glucose + looooong_name_1 + looooong_name_2 + short_name |
                  pregnant + pressure + triceps + insulin + mass + pedigree + age,
                data = PimaIndiansDiabetes, fit = logit)

## Summary function from:
## https://stackoverflow.com/questions/65495322/partykit-modify-terminal-node-to-include-standard-deviation-and-significance-of/65500344#65500344
mysummary <- function(info, digits = 2) {
  n <- info$nobs
  na <- format(names(coef(info$object)))
  cf <- format(coef(info$object), digits = digits)
  se <- format(sqrt(diag(vcov(info$object))), digits = digits)
  t <- format(coef(info$object) / sqrt(diag(vcov(info$object))), digits = digits)
  c(paste("n =", n),
    paste("Regressor", "beta", "[", "t-ratio", "]"),
    paste(na, cf, "[", t, "]"))
}

## plot tree
plot(pid_tree,
     terminal_panel = node_terminal,
     tp_args = list(FUN = mysummary, fill = c("white")),
     gp = gpar(fontsize = 10,
               boxwidth = 10,             ## apparently this option doesn't belong here,
               margins = rep(0.01, 4)))   ## and neither does this one.
This is what I am getting:
but I would like to get something like the following:
Thanks a lot.
Solution
A simple and basic solution is to use a monospaced (fixed-width) font like Courier or Inconsolata:
plot(pid_tree, terminal_panel = node_terminal, tp_args = list(FUN = mysummary, fill = "white"), gp = gpar(fontfamily = "inconsolata"))
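Fonts such as Courier or Inconsolata give every character the same width, so the rows line up only if the strings themselves are padded to equal length. In the question's mysummary the header row is pasted together without padding, so it can still drift relative to the coefficient rows. The following is a minimal sketch of one way to pad all columns, not the original answer's code: the name mysummary_aligned is hypothetical, and it assumes that info carries the nobs and object elements used in the question.

```r
## Sketch (hypothetical variant of mysummary): pad every column to a
## common width so that rows line up under a fixed-width font.
mysummary_aligned <- function(info, digits = 2) {
  cf <- coef(info$object)
  se <- sqrt(diag(vcov(info$object)))
  ## format() applied to a whole character vector pads all entries to one width,
  ## so the header is formatted together with the values of its column
  na <- format(c("Regressor", names(cf)), justify = "left")
  b  <- format(c("beta", format(cf, digits = digits)), justify = "right")
  tv <- format(c("t-ratio", format(cf / se, digits = digits)), justify = "right")
  c(paste("n =", info$nobs),
    paste(na, b, paste0("[", tv, "]")))
}

## Stand-alone check, using a plain lm() fit in place of a mob() node
info <- list(nobs = nrow(mtcars), object = lm(mpg ~ wt + hp, data = mtcars))
rows <- mysummary_aligned(info)
writeLines(rows)
```

If this behaves as intended, it could be passed as FUN = mysummary_aligned in tp_args, together with the fontfamily setting shown above.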
In addition to this simple text-based table, you can also produce more elaborate tables, e.g., via ggplot2 and gtable, as in the following plot taken from: Seibold, Hothorn, Zeileis (2019). "Generalised Linear Model Trees with Global Additive Effects." Advances in Data Analysis and Classification, 13, 703-725. doi:10.1007/s11634-018-0342-1
The code is a little bit involved but available in the replication materials of the article. Specifically, you need these two files: