confusionMatrix for logistic regression in R

Question

I want to calculate two confusion matrices for my logistic regression, one using my training data and one using my testing data:

logitMod <- glm(LoanStatus_B ~ ., data=train, family=binomial(link="logit"))

I set the threshold of predicted probability at 0.5:

confusionMatrix(table(predict(logitMod, type="response") >= 0.5,
                      train$LoanStatus_B == 1))

The code above works well for my training set. However, when I use the test set:

confusionMatrix(table(predict(logitMod, type="response") >= 0.5,
                      test$LoanStatus_B == 1))

it gives me an error:

Error in table(predict(logitMod, type = "response") >= 0.5, test$LoanStatus_B == : all arguments must have the same length

Why is this? How can I fix this? Thank you!

Recommended answer

I think the problem is with how predict is used: you forgot to provide the new data. Without newdata, predict returns the fitted values for the rows the model was trained on, so the result has the length of the training set and cannot be tabulated against test$LoanStatus_B. Also, you can use the function confusionMatrix from the caret package to compute and display confusion matrices; you don't need to call table on your results before that call.
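To illustrate the length mismatch, here is a minimal sketch in base R. The `train` and `test` data frames are simulated for illustration only (their sizes, columns, and the seed are assumptions, not the asker's data). It compares the length of the predictions with and without `newdata`.

```r
set.seed(42)  # arbitrary seed, only so the illustration is repeatable

# Hypothetical data: 100 training rows and 40 test rows
train <- data.frame(LoanStatus_B = rbinom(100, 1, 0.4), b = rnorm(100), c = rnorm(100))
test  <- data.frame(LoanStatus_B = rbinom(40, 1, 0.4),  b = rnorm(40),  c = rnorm(40))

logitMod <- glm(LoanStatus_B ~ ., data = train, family = binomial(link = "logit"))

# No newdata: fitted probabilities for the training rows
p_default <- predict(logitMod, type = "response")

# With newdata: predicted probabilities for the test rows
p_test <- predict(logitMod, newdata = test, type = "response")

length(p_default)  # equals nrow(train)
length(p_test)     # equals nrow(test)

# The lengths now agree, so table() no longer complains
table(Predicted = p_test >= 0.5, Observed = test$LoanStatus_B == 1)
```

Since `p_default` has one entry per training row, tabulating it against `test$LoanStatus_B` can only work when the two sets happen to be the same size, and even then the result would be meaningless.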

Here, I created a toy dataset that includes a representative binary target variable, and then I trained a model similar to yours.

train <- data.frame(LoanStatus_B = as.numeric(rnorm(100)>0.5), b= rnorm(100), c = rnorm(100), d = rnorm(100))
logitMod <- glm(LoanStatus_B ~ ., data=train, family=binomial(link="logit"))

Now, you can predict on the data (for example, your training set) and then use confusionMatrix(), which takes two arguments:

  • your predictions
  • the observed classes

library(caret)
# Use your model to make predictions; in this example newdata = training set, but replace it with your test set
pdata <- predict(logitMod, newdata = train, type = "response")

# Use caret to compute a confusion matrix.
# Recent caret versions require both arguments to be factors with the same levels.
confusionMatrix(data = factor(as.numeric(pdata > 0.5), levels = c(0, 1)),
                reference = factor(train$LoanStatus_B, levels = c(0, 1)))


Here is the result:

Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 66 33
         1  0  1

               Accuracy : 0.67
                 95% CI : (0.5688, 0.7608)
    No Information Rate : 0.66
    P-Value [Acc > NIR] : 0.4625
