Is it possible to retrieve False Positives/False Negatives identified by a confusion matrix?

Views: 135 Published: 2020/5/4 10:06:20 python matrix machine-learning


Problem description

I am using scikit-learn and am using a confusion matrix to get more insight into how my algorithm is performing:

from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    keywords_list, label_list, test_size=0.33, random_state=42)
pipeline.fit(X_train, Y_train)
pred = pipeline.predict(X_test)
print(confusion_matrix(Y_test, pred))

I get output like this:

[[1011   72]
[ 154 1380]]

I assume it follows this format for such matrices:

TP|FP
FN|TN
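
For reference, scikit-learn's confusion_matrix puts the true labels on the rows and the predicted labels on the columns, with classes in sorted order unless labels= is passed, so the layout assumed above may not match what is printed. A minimal sketch on made-up labels (y_true and y_pred here are illustrative, not the data from the question), treating "Spam" as the positive class:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels, not the asker's data.
y_true = ["Ham", "Ham", "Ham", "Spam", "Spam", "Spam", "Spam"]
y_pred = ["Ham", "Spam", "Ham", "Spam", "Ham", "Ham", "Spam"]

# Rows are true classes, columns are predicted classes,
# both in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=["Ham", "Spam"])

# With "Spam" as the positive class (second label), the flattened
# matrix reads TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(cm)
print("TN", tn, "FP", fp, "FN", fn, "TP", tp)
```

Which cell counts as a "false positive" depends on which label is treated as positive and on the label order, so it may be worth passing labels= explicitly before reading the cells.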

Is it possible to retrieve the samples that are being classified as false positives and false negatives? Knowing what that data looks like would help with my work. It goes without saying that I am new to scikit-learn.

Alessandro gave good advice by pointing out that Y_test != pred would return all of the false positives and false negatives counted in the confusion matrix.
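
To make that concrete, here is a small sketch of how the comparison can be used as a boolean mask. It assumes X_test and Y_test are pandas Series sharing an index and that pred is the array returned by predict; the toy values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's X_test, Y_test and pred.
X_test = pd.Series(
    ["win cash now", "lunch at noon?", "free prize", "meeting notes"],
    index=[10, 11, 12, 13],
)
Y_test = pd.Series(["Spam", "Ham", "Spam", "Ham"], index=X_test.index)
pred = np.array(["Spam", "Spam", "Ham", "Ham"])

# Element-wise comparison gives a boolean Series: True where the
# prediction differs from the true label.
wrong = Y_test != pred

# Indexing with the mask keeps only the misclassified texts
# (false positives and false negatives together).
misclassified = X_test[wrong]
print(misclassified)
```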

One factor that I should have mentioned in my original question is that I am classifying textual data under binary labels (e.g. Ham/Spam), and I want to identify the false positives and false negatives separately from each other. My current code for extracting false negatives takes this form:

false_neg = open('false_neg.csv', 'w')
falsen_list = X_test[(Y_test == 'Spam') and (pred == 'Ham')] #False Negatives
wr2 = csv.writer(false_neg, quoting=csv.QUOTE_ALL)
for x in falsen_list:
    wr2.writerow([x])

Unfortunately, this throws an error:

  Traceback (most recent call last):
  File "/home/noname365/PycharmProjects/MLCorpusBlacklist/CorpusML_training.py", line 171, in <module>
    falsen_list = X_test[(Y_test == 'blacklisted') and (pred == 'clean')] #False Negatives
  File "/home/noname365/virtualenvs/env35/lib/python3.5/site-packages/pandas/core/generic.py", line 731, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Am I on the right track here?
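
The ValueError above comes from Python's `and`, which asks each Series for a single truth value; pandas instead expects the element-wise `&` operator, with each comparison wrapped in parentheses. Below is a hedged sketch of separating the two error types on invented data, assuming 'Spam' is the positive class. The helper write_rows is hypothetical and not part of the original code, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's X_test, Y_test and pred.
X_test = pd.Series(
    ["win cash now", "lunch at noon?", "free prize", "meeting notes"],
    index=[10, 11, 12, 13],
)
Y_test = pd.Series(["Spam", "Ham", "Spam", "Ham"], index=X_test.index)
pred = np.array(["Spam", "Spam", "Ham", "Ham"])

# `&` combines the two masks element by element; `and` would raise
# "The truth value of a Series is ambiguous".
false_neg = X_test[(Y_test == "Spam") & (pred == "Ham")]  # spam that was missed
false_pos = X_test[(Y_test == "Ham") & (pred == "Spam")]  # ham flagged as spam


def write_rows(texts, handle):
    """Write one fully quoted text per CSV row (hypothetical helper)."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
    for text in texts:
        writer.writerow([text])


# io.StringIO stands in for open('false_neg.csv', 'w', newline='').
buffer = io.StringIO()
write_rows(false_neg, buffer)
print(buffer.getvalue())
```

With a real file, opening it in a `with` block would also make sure it is closed after writing.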
