Arrays TP, TN, FP and FN in Python
Question
My prediction results look like this:
TestArray
[1,0,0,0,1,0,1,...,1,0,1,1],
[1,0,1,0,0,1,0,...,0,1,1,1],
[0,1,1,1,1,1,0,...,0,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],
PredictionArray
[1,0,0,0,0,1,1,...,1,0,1,1],
[1,0,1,1,1,1,0,...,1,0,0,1],
[0,1,0,1,0,0,0,...,1,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],
This is the size of the arrays that I have:
TestArray.shape
Out[159]: (200, 24)
PredictionArray.shape
Out[159]: (200, 24)
I want to get TP, TN, FP and FN for these arrays.
I tried this code:
cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)
Result:
125 5 0 1
I checked the shape of cm:
cm.shape
Out[168]: (17, 17)
125 + 5 + 0 + 1 = 131, which does not equal 200, the number of rows I have.
I am expecting 200, since each cell in the array is supposed to be one of TP, TN, FP or FN, so the total should be 200.
How can I solve this?
Here is an example of the problem:
import numpy as np
from sklearn.metrics import confusion_matrix
TestArray = np.array(
[
[1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
[0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
[1,0,1,1,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,1,1,1],
[0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
[1,0,0,1,1,1,0,1,1,0,1,0,0,1,1,0,0,1,0,0],
[1,1,1,0,0,1,0,0,1,1,0,1,0,1,1,1,1,1,0,1],
[0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,1,1],
[1,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,1,0,1,0],
[1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,0]
])
TestArray.shape
PredictionArray = np.array(
[
[0,0,0,1,1,1,1,0,0,0,1,0,0,0,1,0,1,0,1,1],
[0,1,0,0,1,0,1,1,0,0,0,1,1,0,0,1,1,0,0,1],
[1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0],
[0,1,0,1,0,0,1,0,0,1,0,1,1,0,0,1,0,0,1,1],
[0,0,1,0,0,1,0,1,1,1,0,1,1,1,0,0,1,1,0,1],
[1,0,0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,1,0],
[1,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,1,0,1],
[0,0,0,0,0,0,0,1,1,0,1,0,0,1,0,1,1,0,1,1],
[1,0,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0],
[1,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,0,0]
])
PredictionArray.shape
cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)
The output is:
5 0 2 0
5 + 0 + 2 + 0 = 7!
There are 20 columns and 10 rows in the array, but cm gives a total of only 7!
Recommended answer
When you use np.argmax, the arrays you pass to sklearn.metrics.confusion_matrix are no longer binary, because np.argmax returns the index of the first occurrence of the maximum value, in this case along axis=1.
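To illustrate what argmax does to binary rows, here is a minimal sketch. The small array y_true and the name row_labels are made up for illustration and are not the asker's data.

```python
import numpy as np

# Hypothetical 3x4 binary array, made up for illustration only.
y_true = np.array([[0, 1, 1, 0],
                   [1, 0, 0, 1],
                   [0, 0, 1, 1]])

# argmax along axis=1 yields one value per row: the column index
# of the first 1 in that row (the first occurrence of the maximum).
row_labels = y_true.argmax(axis=1)
print(row_labels)        # [1 0 2]
print(row_labels.shape)  # (3,)
```

The result holds column indices rather than 0/1 outcomes, so confusion_matrix treats each distinct index as its own class.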
You don't get the good ol' true positives / hits, true negatives / correct rejections, etc., when your prediction isn't binary.
You should find that sum(sum(cm)) indeed equals 200 for the original (200, 24) arrays, because argmax(axis=1) yields one label per row; for the 10-row example above it equals 10.
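As a rough check of that claim, the sketch below uses random 0/1 data as a stand-in for the asker's arrays (the real data is not available, so test_arr, pred_arr and the seed are assumptions). It shows that the full matrix sums to the number of rows, while the four cells read in the question cover only its top-left corner.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Random 0/1 data standing in for the asker's (200, 24) arrays; not the real data.
rng = np.random.default_rng(0)
test_arr = rng.integers(0, 2, size=(200, 24))
pred_arr = rng.integers(0, 2, size=(200, 24))

cm = confusion_matrix(test_arr.argmax(axis=1), pred_arr.argmax(axis=1))

# One (true, predicted) pair per row, so all cells together count the 200 rows.
print(cm.sum())  # 200

# The four cells read in the question are only the top-left 2x2 corner
# of a larger multi-class matrix, so they can add up to less than 200.
corner = cm[0][0] + cm[1][0] + cm[1][1] + cm[0][1]
print(corner <= cm.sum())
```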
If each element of the arrays represents an individual prediction, i.e. you are trying to get TP/TN/FP/FN for a total of 200 (10 * 20) predictions, each with an outcome of either 0 or 1, then you can obtain TP/TN/FP/FN by flattening the arrays before passing them to confusion_matrix. That is to say, you could reshape TestArray and PredictionArray to (200,), e.g.:
cm = confusion_matrix(TestArray.reshape(-1), PredictionArray.reshape(-1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN, FN, TP, FP, '=', TN + FN + TP + FP)
which returns
74 28 73 25 = 200
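As a cross-check of the flattening approach, the sketch below unpacks the four counts with ravel() and compares them against plain NumPy boolean masks. The small arrays y_true and y_pred are hypothetical, and passing labels=[0, 1] is an added assumption that keeps the matrix 2x2 even if one class happens to be absent.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical 2x5 binary arrays, made up for illustration only.
y_true = np.array([[1, 0, 1, 1, 0],
                   [0, 1, 0, 1, 1]])
y_pred = np.array([[1, 1, 0, 1, 0],
                   [0, 1, 1, 0, 1]])

# Flatten so that every cell counts as one binary prediction.
t = y_true.ravel()
p = y_pred.ravel()

# For binary labels the 2x2 matrix flattens in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(t, p, labels=[0, 1]).ravel()

# The same counts computed directly with boolean masks.
tp_np = int(np.sum((t == 1) & (p == 1)))
tn_np = int(np.sum((t == 0) & (p == 0)))
fp_np = int(np.sum((t == 0) & (p == 1)))
fn_np = int(np.sum((t == 1) & (p == 0)))

print(tn, fp, fn, tp)              # 2 2 2 4
print(tn_np, fp_np, fn_np, tp_np)  # 2 2 2 4
print(tn + fp + fn + tp == t.size)  # True
```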