R: Obtaining Rules from a Function


Problem description

I am using the R programming language. I used the "rpart" library and fit a decision tree using some data:

#from a previous question : https://stackoverflow.com/questions/65678552/r-changing-plot-sizes 

    library(rpart)

    car.test.frame$Reliability = as.factor(car.test.frame$Reliability)

    z.auto <- rpart(Reliability ~ ., car.test.frame)
    plot(z.auto)
    text(z.auto, use.n=TRUE, xpd=TRUE, cex=.8)

This is good, but I am looking for an easier way to summarize the results of this tree in case it becomes too big, complicated and cluttered to visualize. I found another Stack Overflow post, "Extracting Information from the Decision Rules in rpart package", that shows how to obtain a listing of rules:

library(party)
library(partykit)

party_obj <- as.party.rpart(z.auto, data = TRUE)
decisions <- partykit:::.list.rules.party(party_obj)
cat(paste(decisions, collapse = "\n"))

This returns the following list of rules (each line is a rule corresponding to the plot of "z.auto"):

    Country %in% c("NA", "Germany", "Korea", "Mexico", "Sweden", "USA") & Weight >= 3167.5
    Country %in% c("NA", "Germany", "Korea", "Mexico", "Sweden", "USA") & Weight < 3167.5
    Country %in% c("NA", "Japan", "Japan/USA")

However, from this list, it is not possible to know which rule results in which value of "Reliability". For the time being, I am manually interpreting the tree and manually tracing each rule to the result, but is there a way to add to each line "the corresponding value of reliability"?

e.g. Is it possible to produce something like this?

Country %in% c("NA", "Germany", "Korea", "Mexico", "Sweden", "USA") & Weight >= 3167.5 then reliability = 3,7,4,0

(note1: I am also not sure why the countries are appearing as "befgh" instead of their actual names.

note2: I am aware that the "rpart.plot" library offers a simpler way of obtaining these rules. However, I am using a computer that has no internet access or USB port, so I cannot download rpart.plot. I have R with a few preloaded packages, and I am trying to obtain the decision rules using libraries such as rpart, dplyr, purrr, party, partykit, and base R functions.)

Thanks

Solution

This isn't my area of expertise, but perhaps this function (from https://www.togaware.com/datamining/survivor/Convert_Tree.html) will do what you want to do:

library(rpart)
car.test.frame$Reliability = as.factor(car.test.frame$Reliability)
z.auto <- rpart(Reliability ~ ., car.test.frame)
plot(z.auto, margin = 0.25)
text(z.auto, pretty = TRUE, cex = 0.8,
     splits = TRUE, use.n = TRUE, all = FALSE)

list.rules.rpart <- function(model)
{
  if (!inherits(model, "rpart")) stop("Not a legitimate rpart tree")
  #
  # Get some information.
  #
  frm     <- model$frame
  names   <- row.names(frm)
  ylevels <- attr(model, "ylevels")
  ds.size <- model$frame[1,]$n
  #
  # Print each leaf node as a rule.
  #
  for (i in 1:nrow(frm))
  {
    if (frm[i,1] == "<leaf>")
    {
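      # The following [,5] is hardwired - needs work!
      # yval2 holds the fitted class, then one count per class, then one
      # probability per class. With the five Reliability levels, column 5
      # is therefore a class count, so "prob" below is not a probability.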
      cat("\n")
      cat(sprintf(" Rule number: %s ", names[i]))
      cat(sprintf("[yval=%s cover=%d (%.0f%%) prob=%0.2f]\n",
                  ylevels[frm[i,]$yval], frm[i,]$n,
                  round(100*frm[i,]$n/ds.size), frm[i,]$yval2[,5]))
      pth <- path.rpart(model, nodes=as.numeric(names[i]), print.it=FALSE)
      cat(sprintf("   %s\n", unlist(pth)[-1]), sep="")
    }
  }
}

list.rules.rpart(z.auto)
> Rule number: 4 [yval=3 cover=10 (20%) prob=0.00]
>   Country=Germany,Korea,Mexico,Sweden,USA
>   Weight>=3168
>
> Rule number: 5 [yval=2 cover=18 (37%) prob=4.00]
>   Country=Germany,Korea,Mexico,Sweden,USA
>   Weight< 3168
>
> Rule number: 3 [yval=5 cover=21 (43%) prob=2.00]
>   Country=Japan,Japan/USA
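
Note that the `prob` values printed above (0.00, 4.00, 2.00) come from the hardwired `[,5]` column and are not probabilities. If it helps, here is a minimal sketch of a variant that returns one row per leaf with the rule, the predicted class, the per-class counts and the probability of the predicted class. It uses only rpart and base R. The name `rules_table` is made up for illustration, and the sketch assumes a classification tree whose `yval2` matrix is laid out as fitted class, class counts, class probabilities, node probability:

```r
library(rpart)

# Hypothetical helper (not part of rpart): one row per leaf of a
# classification tree, pairing each rule with its predicted class.
rules_table <- function(model) {
  stopifnot(inherits(model, "rpart"), model$method == "class")
  frm     <- model$frame
  ylevels <- attr(model, "ylevels")
  nclass  <- length(ylevels)

  leaves  <- which(frm$var == "<leaf>")
  node_id <- as.numeric(row.names(frm)[leaves])
  paths   <- path.rpart(model, nodes = node_id, print.it = FALSE)

  # Assumed yval2 layout: column 1 = fitted class, next nclass columns =
  # class counts, next nclass columns = class probabilities.
  yval2  <- frm$yval2[leaves, , drop = FALSE]
  pred   <- frm$yval[leaves]
  counts <- yval2[, 1 + seq_len(nclass), drop = FALSE]
  prob   <- yval2[cbind(seq_along(leaves), 1 + nclass + pred)]

  data.frame(
    node      = node_id,
    # Drop the leading "root" element and join the remaining conditions.
    rule      = unname(vapply(paths,
                              function(p) paste(p[-1], collapse = " & "),
                              character(1))),
    predicted = ylevels[pred],
    n         = frm$n[leaves],
    counts    = unname(apply(counts, 1, paste, collapse = ",")),
    prob      = round(prob, 2),
    stringsAsFactors = FALSE
  )
}

# Usage with the same data as above (working on a copy of the data set).
cars <- car.test.frame
cars$Reliability <- as.factor(cars$Reliability)
z.auto <- rpart(Reliability ~ ., cars)

res <- rules_table(z.auto)
cat(sprintf("%s then reliability = %s (counts = %s)\n",
            res$rule, res$predicted, res$counts), sep = "")
```

Returning a data frame rather than printing makes it easy to filter or sort the rules when the tree is large, for example `res[order(-res$n), ]` to list the leaves that cover the most rows first.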
