AxisError: axis 1 is out of bounds for array of dimension 1 when calculating AUC
Problem description
I have a classification problem: I have the pixel values of 8x8 images and the digit each image represents, and my task is to predict the digit (the 'Number' attribute) from the pixel values using RandomForestClassifier. The digits range from 0 to 9.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
forest_model = RandomForestClassifier(n_estimators=100, random_state=42)
forest_model.fit(train_df[input_var], train_df[target])
test_df['forest_pred'] = forest_model.predict_proba(test_df[input_var])[:,1]
roc_auc_score(test_df['Number'], test_df['forest_pred'], average = 'macro', multi_class="ovr")
Here it throws an AxisError.
Traceback (most recent call last):
  File "dap_hazi_4.py", line 44, in <module>
    roc_auc_score(test_df['Number'], test_df['forest_pred'], average = 'macro', multi_class="ovo")
  File "/home/balint/.local/lib/python3.6/site-packages/sklearn/metrics/_ranking.py", line 383, in roc_auc_score
    multi_class, average, sample_weight)
  File "/home/balint/.local/lib/python3.6/site-packages/sklearn/metrics/_ranking.py", line 440, in _multiclass_roc_auc_score
    if not np.allclose(1, y_score.sum(axis=1)):
  File "/home/balint/.local/lib/python3.6/site-packages/numpy/core/_methods.py", line 38, in _sum
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)
AxisError: axis 1 is out of bounds for array of dimension 1
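Because your problem is multi-class, roc_auc_score needs a score for every class, not a single column. The traceback shows the failure at y_score.sum(axis=1): predict_proba(...)[:,1] keeps only the probabilities of one class, so the scores are a 1-D array with no axis 1. The 'multi_class' argument works once you pass the full probability matrix returned by predict_proba. The labels themselves can stay as a 1-D column of class values; one-hot encoding them is not required.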
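The failing line from the traceback can be reproduced with NumPy alone. This is a minimal sketch; the arrays `scores_1d` and `scores_2d` are made-up values for illustration, not data from the question.

```python
import numpy as np

# Hypothetical scores: one value per sample, as produced by slicing [:, 1].
scores_1d = np.array([0.1, 0.7, 0.2])

try:
    scores_1d.sum(axis=1)  # a 1-D array only has axis 0
except ValueError as exc:  # NumPy's AxisError subclasses ValueError
    message = str(exc)

# Hypothetical scores: one row per sample, one column per class.
scores_2d = np.array([[0.1, 0.7, 0.2],
                      [0.3, 0.3, 0.4]])
row_sums = scores_2d.sum(axis=1)  # per-sample sum across classes

print(message)
print(row_sums)
```

The 2-D case is the layout the multi-class check in the traceback operates on: each row holds the class probabilities of one sample.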
For example, with 100 test samples and 5 unique classes, the score matrix passed as the second argument must have shape (100, 5), not (100,).
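A sketch of the corrected call could look like the following. It assumes scikit-learn's built-in digits dataset (8x8 images, labels 0-9) as a stand-in for the asker's train_df and test_df, and the variable names `X_train`, `X_test`, `y_train`, `y_test`, `proba`, and `one_column` are illustrative only.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: load_digits stands in for the asker's own data frames.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

forest_model = RandomForestClassifier(n_estimators=100, random_state=42)
forest_model.fit(X_train, y_train)

# Keep every column: one probability per class for each sample.
proba = forest_model.predict_proba(X_test)

# The slice used in the question drops to a single class and a 1-D array.
one_column = proba[:, 1]

# y_test stays a 1-D array of class labels; no one-hot encoding is applied.
auc = roc_auc_score(y_test, proba, average="macro", multi_class="ovr")

print(proba.shape, one_column.shape)
print(auc)
```

Applied to the code in the question, the same idea means scoring with the unsliced result of forest_model.predict_proba(test_df[input_var]) instead of storing a single column in test_df['forest_pred'].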