Represent more than 20 levels in a glmtree


Question

I am currently working with the glmtree() function in R. I have some factor variables with 20+ levels. The problem is the representation of the tree: because certain variables have so many levels (e.g., i_mode has 29), some of the information at certain leaves is impossible to read.

One possible solution would be to "dummify" those levels. However, I'd rather not do that if it can be avoided.

Do you know a way to represent the same plot in a more readable form?

Any clues?

Thanks

Recommended answer

My feeling is that such a plot will be challenging to understand, even beyond the labeling issue. Personally, I would try to break such a factor down into more intelligible groups with fewer levels (not necessarily binary, though).
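As a rough illustration of grouping levels, the sketch below collapses a 29-level factor into four coarser groups using base R only. The data, the level names (`mode_1` to `mode_29`), and the group names are all hypothetical stand-ins; in practice the mapping should come from subject-matter knowledge about what the levels of i_mode mean.

```r
# Hypothetical stand-in for i_mode: a factor with 29 levels.
set.seed(1)
lev <- paste0("mode_", 1:29)
i_mode <- factor(sample(lev, 500, replace = TRUE), levels = lev)

# Hypothetical mapping: one group label per original level, in level order.
# Here the groups are assigned arbitrarily, purely for illustration.
group_of <- rep(c("group_A", "group_B", "group_C", "group_D"),
                length.out = nlevels(i_mode))

# Assigning duplicated labels to levels() merges the corresponding levels.
i_mode_grouped <- i_mode
levels(i_mode_grouped) <- group_of

nlevels(i_mode_grouped)        # number of levels after grouping
table(i_mode, i_mode_grouped)  # check which original level went where
```

The grouped factor can then be used as the partitioning variable in place of the original one, which keeps the edge labels short.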

Having said that, the panel function edge_simple(), which draws the edge labels in the tree, has some arguments that can help improve readability; e.g., you can alternate the labels' positions and change the font size. For a worked example, see the question "R partykit::ctree offset labels on edges". Additionally, you could try abbreviating the factor levels prior to learning the tree. However, with 29 levels all of this will probably not help much, I'm afraid.
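A minimal sketch of these two suggestions follows, assuming the partykit package is installed. The data set, the variable names other than i_mode, and the level names are invented for illustration, and only eight levels are used so that the example runs quickly. The levels are shortened with base R's abbreviate() before fitting, and the plot call passes `justmin` to edge_simple() through `ep_args` and a smaller font size through `gp`; the specific values 10 and 8 are arbitrary choices to tune.

```r
# Hypothetical data: a binary response, one regressor, and a factor
# with long level names used as the partitioning variable.
set.seed(1)
n <- 400
lev <- paste("transport mode", LETTERS[1:8])
d <- data.frame(
  x = rnorm(n),
  i_mode = factor(sample(lev, n, replace = TRUE), levels = lev)
)
# The intercept differs between the first four and the last four levels.
eta <- ifelse(d$i_mode %in% lev[1:4], -1, 1) + 0.5 * d$x
d$y <- factor(rbinom(n, 1, plogis(eta)), labels = c("no", "yes"))

# Abbreviate the level labels before learning the tree.
levels(d$i_mode) <- unname(abbreviate(levels(d$i_mode), minlength = 6))

if (requireNamespace("partykit", quietly = TRUE)) {
  tree <- partykit::glmtree(y ~ x | i_mode, data = d, family = binomial)
  # justmin: edge labels longer than this (on average) are drawn at
  # alternating heights; gp sets a smaller font for the whole plot.
  plot(tree,
       ep_args = list(justmin = 10),
       gp = grid::gpar(fontsize = 8))
}
```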
