R ranger confusion.matrix is larger than expected when using expand.grid and purrr::pmap

Question

Sorry for all the purrr-related questions today; I'm still trying to figure out how to use it efficiently.

With some help from SO I managed to get a random forest model running with ranger, based on input values coming from a data.frame. This is done with purrr::pmap. However, I don't understand how the return values of the called function are generated. Consider this example:

library(ranger)
data(iris)
Input_list <- list(iris1 = iris, iris2 = iris)  # let's assume these are different input tables

# the data.frame with the values for the function
hyper_grid <- expand.grid(
  Input_table = names(Input_list),
  mtry = c(1,2),
  Classification = TRUE,
  Target = "Species")

> hyper_grid
  Input_table mtry Classification  Target
1       iris1    1           TRUE Species
2       iris2    1           TRUE Species
3       iris1    2           TRUE Species
4       iris2    2           TRUE Species

# the function to be called for each row of the `hyper_grid` data frame
fit_and_extract_metrics <- function(Target, Input_table, Classification, mtry, ...) {
  RF_train <- ranger(
    dependent.variable.name = Target, 
    mtry = mtry,
    data = Input_list[[Input_table]],  # referring to the named object in the list
    classification = Classification)  # otherwise regression is performed

  RF_train$confusion.matrix
}

# pmap calls the function once per row of hyper_grid, matching columns to arguments by name
purrr::pmap(hyper_grid, fit_and_extract_metrics)

It is supposed to return four 3x3 confusion matrices, as there are 3 levels in iris$Species; instead it returns giant confusion matrices. Can someone explain what is going on?

The first lines of the output:

> purrr::pmap(hyper_grid, fit_and_extract_metrics)
[[1]]
     predicted
true  4.4 4.7 4.8 4.9 5 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6 6.1 6.2 6.3 6.4
  4.3   1   0   0   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.4   1   1   1   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.5   1   0   0   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.6   0   1   1   1 1   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.7   1   0   1   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.8   0   0   1   3 1   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.9   0   0   1   2 2   0   0   0   0   0   0   0   0   0 1   0   0   0   0
  5     0   0   0   1 9   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  5.1   0   0   0   0 0   8   0   0   0   1   0   0   0   0 0   0   0   0   0

Recommended answer

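The problem is that the arguments passed to the function were factors, not character strings. By default, expand.grid converts character vectors to factors (stringsAsFactors = TRUE), and this tripped up ranger: the class labels in the output above (4.3, 4.4, ...) are Sepal.Length values, which suggests that the one-level factor Target was used as its integer code 1 and so selected the first column of iris instead of Species. To solve this, all you need to do is set stringsAsFactors = FALSE in the expand.grid call: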

hyper_grid <- expand.grid(
    Input_table = names(Input_list),
    mtry = c(1,2),
    Classification = TRUE,
    Target = "Species", stringsAsFactors = FALSE)

You will then get:

[[1]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         46         4
  virginica       0          4        46

[[2]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         46         4
  virginica       0          5        45

[[3]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         47         3
  virginica       0          3        47

[[4]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         47         3
  virginica       0          3        47
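
To check whether a parameter grid carries factors before handing it to pmap, the column classes can be inspected directly. The following is a minimal base R sketch, not part of the original answer; the names grid_factor, grid_char, target_factor and grid_fixed are made up for illustration, and the last step shows one possible way to repair a grid that was already built with the default settings.

```r
# Default behaviour: expand.grid turns character inputs into factors
grid_factor <- expand.grid(
  Input_table = c("iris1", "iris2"),
  Target = "Species")
print(sapply(grid_factor, class))

# With stringsAsFactors = FALSE the columns stay character
grid_char <- expand.grid(
  Input_table = c("iris1", "iris2"),
  Target = "Species",
  stringsAsFactors = FALSE)
print(sapply(grid_char, class))

# A one-level factor has the integer code 1. If that code is used as a
# column position in iris, it points at Sepal.Length rather than Species.
target_factor <- grid_factor$Target[1]
print(as.integer(target_factor))
print(names(iris)[as.integer(target_factor)])

# One possible repair for an existing grid: convert factor columns back
# to character and leave all other columns untouched
grid_fixed <- grid_factor
grid_fixed[] <- lapply(grid_fixed, function(col) {
  if (is.factor(col)) as.character(col) else col
})
print(sapply(grid_fixed, class))
```

Calling as.character() on Target inside fit_and_extract_metrics would be another defensive option, under the same assumption that the factor columns are the cause.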
