Properties and their values out of J48 tree (RWeka)


Question

If you run the following commands:

library(RWeka) 
data(iris) 
res = J48(Species ~., data = iris)

res will be a list of class J48 inheriting from Weka_tree. If you print it:

R> res
J48 pruned tree
------------------

Petal.Width <= 0.6: setosa (50.0)
Petal.Width > 0.6
|   Petal.Width <= 1.7
|   |   Petal.Length <= 4.9: versicolor (48.0/1.0)
|   |   Petal.Length > 4.9
|   |   |   Petal.Width <= 1.5: virginica (3.0)
|   |   |   Petal.Width > 1.5: versicolor (3.0/1.0)
|   Petal.Width > 1.7: virginica (46.0/1.0)

Number of Leaves  :     5

Size of the tree :  9

I would like to get the properties and their values in order, from right to left. So for this case:

Petal.Width, Petal.Width, Petal.Length, Petal.Length.

I tried converting res to a factor and running the command:

str_extract(paste0(x, collapse=""), perl("(?<=\\|)[A-Za-z]+(?=\\|)"))

with no success. Keep in mind that the surrounding characters on the left should be ignored.

Recommended answer

One way to do this is to convert the J48 object from RWeka to a party object from partykit. You just need to call as.party(res); this does all the parsing for you and returns a structure that is easier to work with, using standardized extractor functions etc.

In particular, you can then use all the advice given in other discussions about ctree objects etc. See

  • Identify all distinct variables within party ctree nodel

And I think the following should do at least part of what you want:

    library("partykit")
    pres <- as.party(res)
    partykit:::.list.rules.party(pres)
    ##                                                                                  2 
    ##                                                               "Petal.Width <= 0.6" 
    ##                                                                                  5 
    ##                     "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length <= 4.9" 
    ##                                                                                  7 
    ## "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length > 4.9 & Petal.Width <= 1.5" 
    ##                                                                                  8 
    ##  "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length > 4.9 & Petal.Width > 1.5" 
    ##                                                                                  9 
    ##                                            "Petal.Width > 0.6 & Petal.Width > 1.7" 
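
The rule strings above can be broken down further into variable names, operators and thresholds. The following is only a minimal sketch in base R: the `rules` vector is copied by hand from the output shown above, and `parse_rule` is a hypothetical helper (not part of partykit). It assumes every condition has the form "variable operator value" separated by single spaces, which holds for the numeric splits in this tree but would not hold for splits on factor levels.

```r
# Rule strings copied from the output above (names are terminal node ids).
rules <- c(
  "2" = "Petal.Width <= 0.6",
  "5" = "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length <= 4.9",
  "7" = "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length > 4.9 & Petal.Width <= 1.5",
  "8" = "Petal.Width > 0.6 & Petal.Width <= 1.7 & Petal.Length > 4.9 & Petal.Width > 1.5",
  "9" = "Petal.Width > 0.6 & Petal.Width > 1.7"
)

# Hypothetical helper: split one rule into a data frame, one row per condition.
# Assumes "<variable> <operator> <value>" with single spaces and no spaces
# inside the variable name.
parse_rule <- function(rule) {
  conds <- strsplit(rule, " & ", fixed = TRUE)[[1]]
  parts <- do.call(rbind, strsplit(conds, " ", fixed = TRUE))
  data.frame(
    variable = parts[, 1],
    operator = parts[, 2],
    value = as.numeric(parts[, 3]),
    stringsAsFactors = FALSE
  )
}

# One data frame per terminal node, named by node id.
parsed <- lapply(rules, parse_rule)

# Variables along the path to terminal node 7, from the root downwards.
parsed[["7"]]$variable
```

Each element of `parsed` lists the conditions along the path from the root to one terminal node, so the variables and their split values can be read off column by column.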
    

Update: The OP contacted me off-list with a related question, asking for a specific printed representation of the tree. I'm including my solution here in case it is useful for someone else.

He wanted ( ) symbols signalling the hierarchy levels, plus the names of the splitting variables. One way to do so would be to (1) extract the variable names of the underlying data:

    nam <- names(pres$data)
    

(2) Turn the recursive node structure of the tree into a flat list (which is somewhat more convenient for constructing the desired string):

    tr <- as.list(pres$node)
    

(3a) Initialize the string:

    str <- "("
    

(3b) Recursively add brackets and/or variable names to the string:

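    # Walk the flat node list depth-first, appending to the global `str`.
    update_str <- function(x) {
       if(is.null(x$kids)) {
         # terminal node: append a closing bracket
         str <<- paste(str, ")")
       } else {
         # inner node: append the splitting variable's name and open a bracket
         str <<- paste(str, nam[x$split$varid], "(")
         for(i in x$kids) update_str(tr[[i]])
       }
    }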
    

(3c) Call the recursion, starting from the root node:

    update_str(tr[[1]])
    str
    ## [1] "( Petal.Width ( ) Petal.Width ( Petal.Length ( ) Petal.Width ( ) ) )"
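
If only the ordered list of splitting variables is needed, the same traversal can return a character vector instead of updating a global string with `<<-`. The sketch below is an illustration, not the answer's own method. So that it runs without RWeka or partykit installed, `nam` and `tr` are hand-built stand-ins for `names(pres$data)` and `as.list(pres$node)`; the column order and the `varid` values are assumptions made for the example, and `leaf`, `inner` and `split_vars` are hypothetical helper names.

```r
# Hypothetical stand-in for names(pres$data); the column order is an assumption.
nam <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species")

# Hypothetical constructors mimicking the fields used above (kids, split$varid).
leaf <- function(id) list(id = id, split = NULL, kids = NULL)
inner <- function(id, varid, kids) {
  list(id = id, split = list(varid = varid), kids = kids)
}

# Hand-built flat node list mirroring the tree printed in the question.
tr <- list(
  inner(1L, 4L, c(2L, 3L)),  # Petal.Width
  leaf(2L),
  inner(3L, 4L, c(4L, 9L)),  # Petal.Width
  inner(4L, 3L, c(5L, 6L)),  # Petal.Length
  leaf(5L),
  inner(6L, 4L, c(7L, 8L)),  # Petal.Width
  leaf(7L),
  leaf(8L),
  leaf(9L)
)

# Collect the splitting variables depth-first (parent before its children),
# returning them rather than modifying a variable outside the function.
split_vars <- function(node, nodes, nam) {
  if (is.null(node$kids)) return(character(0))
  below <- lapply(node$kids, function(i) split_vars(nodes[[i]], nodes, nam))
  c(nam[node$split$varid], unlist(below))
}

split_vars(tr[[1]], tr, nam)
```

With a real tree, `tr` and `nam` would come from steps (1) and (2) above, and the function could be called the same way on `tr[[1]]`.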
    
