R ranger confusion.matrix is larger than supposed when using expand.grid and purrr::pmap
Question
Sorry for all the purrr-related questions today; I'm still trying to figure out how to use it efficiently.
With some help from SO I managed to get a random forest ranger model running based on input values coming from a data.frame. This is accomplished using purrr::pmap. However, I don't understand how the return values are generated from the called function. Consider this example:
library(ranger)
data(iris)
Input_list <- list(iris1 = iris, iris2 = iris) # let's assume these are different input tables
# the data.frame with the values for the function
hyper_grid <- expand.grid(
  Input_table = names(Input_list),
  mtry = c(1,2),
  Classification = TRUE,
  Target = "Species")
> hyper_grid
  Input_table mtry Classification  Target
1       iris1    1           TRUE Species
2       iris2    1           TRUE Species
3       iris1    2           TRUE Species
4       iris2    2           TRUE Species
# the function to be called for each row of the `hyper_grid` data frame
fit_and_extract_metrics <- function(Target, Input_table, Classification, mtry, ...) {
  RF_train <- ranger(
    dependent.variable.name = Target,
    mtry = mtry,
    data = Input_list[[Input_table]], # referring to the named object in the list
    classification = Classification) # otherwise regression is performed
  RF_train$confusion.matrix
}
# pmap calls the function once per row of hyper_grid, matching columns to arguments by name
purrr::pmap(hyper_grid, fit_and_extract_metrics)
It is supposed to return four 3x3 confusion matrices, since there are 3 levels in iris$Species; instead it returns giant confusion matrices. Can someone explain to me what is going on?
The first rows of the output:
> purrr::pmap(hyper_grid, fit_and_extract_metrics)
[[1]]
      predicted
true  4.4 4.7 4.8 4.9 5 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6 6.1 6.2 6.3 6.4
  4.3   1   0   0   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.4   1   1   1   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.5   1   0   0   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.6   0   1   1   1 1   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.7   1   0   1   0 0   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.8   0   0   1   3 1   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  4.9   0   0   1   2 2   0   0   0   0   0   0   0   0   0 1   0   0   0   0
  5     0   0   0   1 9   0   0   0   0   0   0   0   0   0 0   0   0   0   0
  5.1   0   0   0   0 0   8   0   0   0   1   0   0   0   0 0   0   0   0   0
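Recommended answer
The problem is that the values passed to the function were factors, not character strings: expand.grid converts character vectors to factors by default (stringsAsFactors = TRUE). This tripped up the ranger function. The row labels of the giant matrix (4.3, 4.4, ...) look like Sepal.Length values, which suggests the Target factor was treated as its integer code and selected the first column of iris rather than Species. To solve this, all you need to do is set stringsAsFactors = FALSE in the expand.grid call: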
hyper_grid <- expand.grid(
  Input_table = names(Input_list),
  mtry = c(1,2),
  Classification = TRUE,
  Target = "Species",
  stringsAsFactors = FALSE)
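To see the difference the argument makes, here is a minimal base-R sketch that builds a small grid both ways and compares the column classes. The names grid_default, grid_fixed and target_code are illustrative only. The last step shows that the factor's integer code would point at the first column of iris; that ranger resolves the factor this way internally is an assumption inferred from the output in the question, not something checked against ranger's source.

```r
# Build the same kind of grid twice: once with defaults, once with the fix.
grid_default <- expand.grid(
  Input_table = c("iris1", "iris2"),
  Target = "Species")
grid_fixed <- expand.grid(
  Input_table = c("iris1", "iris2"),
  Target = "Species",
  stringsAsFactors = FALSE)

# Column classes differ: factor by default, character with the fix.
sapply(grid_default, class)
sapply(grid_fixed, class)

# A factor stores integer codes; "Species" is the only level, so its code is 1.
target_code <- as.integer(grid_default$Target[1])
target_code

# Used as a column position in iris, that code points at the first column,
# not at Species (assumed to be what happened inside the ranger call).
names(iris)[target_code]
```

Running sapply(hyper_grid, class) on your own grid before calling pmap is a quick way to catch this.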
With stringsAsFactors = FALSE, the same pmap call returns:
[[1]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         46         4
  virginica       0          4        46

[[2]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         46         4
  virginica       0          5        45

[[3]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         47         3
  virginica       0          3        47

[[4]]
            predicted
true         setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         47         3
  virginica       0          3        47
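If the grid has already been built with the default settings (or comes from code you do not control), another option is to convert its factor columns back to character before passing it to pmap. The sketch below uses base R only; factors_to_character is a hypothetical helper written for this example, not a function from ranger or purrr.

```r
# Hypothetical helper: turn every factor column of a data frame into character,
# leaving numeric and logical columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)
  df
}

# A grid built with the default stringsAsFactors = TRUE.
hyper_grid_default <- expand.grid(
  Input_table = c("iris1", "iris2"),
  mtry = c(1, 2),
  Classification = TRUE,
  Target = "Species")

hyper_grid_clean <- factors_to_character(hyper_grid_default)

# Input_table and Target are now character; mtry and Classification are unchanged.
sapply(hyper_grid_clean, class)
```

The cleaned grid can then be passed to purrr::pmap in the same way as the grid created with stringsAsFactors = FALSE.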