Get classification accuracy on test data using a previously saved model


Question

I am using the Orange data mining tool to write a Python script that gets the classification accuracy on test data using a previously saved model (a pickle file).

import pickle

import Orange

# Train a random forest on the training data.
dataFile = "training.csv"
data = Orange.data.Table(dataFile)
learner = Orange.classification.RandomForestLearner()
cf = learner(data)

# Save the trained model to a pickle file.
with open("1.pkcls", "wb") as f:
    pickle.dump(cf, f)

# Load the model back from the pickle file.
with open("1.pkcls", "rb") as f:
    loadCF = pickle.load(f)

testFile = "testing.csv"
test = Orange.data.Table(testFile)

# Pass the already-trained model as the only "learner".
learners = [cf]
result = Orange.evaluation.testing.TestOnTestData(data, test, learners)

# Get the classification accuracy.
CAs = Orange.evaluation.CA(result)

I can save and load the model successfully, but the following line raises an error:

    CAs = Orange.evaluation.CA(result)


File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/Orange/evaluation/scoring.py", line 39, in __new__
    return self(results, **kwargs)
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/Orange/evaluation/scoring.py", line 48, in __call__
    return self.compute_score(results, **kwargs)
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/Orange/evaluation/scoring.py", line 84, in compute_score
    return self.from_predicted(results, skl_metrics.accuracy_score)
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/Orange/evaluation/scoring.py", line 75, in from_predicted
    dtype=np.float64, count=len(results.predicted))
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/Orange/evaluation/scoring.py", line 74, in <genexpr>
    for predicted in results.predicted),
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/Users/anaconda2/envs/py36/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 82, in _check_targets
    "".format(type_true, type_pred))
ValueError: Can't handle mix of multiclass and continuous

I found a way to work around this problem and obtain the classification accuracy, which is to delete this line:

cf = learner(data)

However, if I delete this line, I can no longer train a model and save it, because nothing trains the RandomForestLearner on the input file before the code that saves and loads the model runs:

with open("1.pkcls", "wb") as f:
    pickle.dump(cf, f)

#load the pickle file
with open("1.pkcls", "rb") as f:
    loadCF = pickle.load(f)

Is it possible to train a model first, save it as a pickle file, and then use it later on another file to get the classification accuracy?

Recommended answer

You must not pre-train the classifier before passing it to TestOnTestData. A more accurate name for it would be TrainOnTrainAndTestOnTestData: it runs the fitting/training step on its own.

Unfortunately, there is no readily available, explicit way to create a Results instance by applying one or more pre-trained classifiers to a test dataset.
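If only the accuracy figure is needed, one possible workaround is to skip the results object and compare the model's predictions with the true labels directly. The sketch below uses only the standard library; `StubModel` and `accuracy` are hypothetical names introduced for illustration, not part of Orange. With Orange, the assumption would be that calling the loaded classifier on the test table returns the predicted class values and that the table's `Y` attribute holds the true ones.

```python
class StubModel:
    """Hypothetical stand-in for a trained classifier (not part of Orange)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def __call__(self, rows):
        # Predict class 1 when the single feature exceeds the threshold.
        return [1 if x > self.threshold else 0 for x in rows]


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Stand-ins for the loaded model and the test data.
model = StubModel(threshold=0.5)
test_features = [0.1, 0.7, 0.4, 0.9]
test_labels = [0, 1, 1, 1]

predictions = model(test_features)
score = accuracy(predictions, test_labels)
print(score)
```

This bypasses the evaluation framework entirely, so it only helps when a single score is wanted and the other scorers that operate on a results object are not needed.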

One quick and dirty way is to wrap the pre-trained models in thunks and pass those as the 'learners' to TestOnTestData, so that each "learner" simply returns its pre-trained model:

results = Orange.evaluation.testing.TestOnTestData(data, test, [lambda testdata: loadCF])
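To see why the thunk works, here is a minimal sketch of the pattern in plain Python. `evaluate_on_test_data` and `PretrainedModel` are hypothetical stand-ins, not Orange's implementation; the only assumption carried over from the answer is that the evaluator calls each learner with the training data and expects a model back. Under that assumption, the argument the lambda receives is the training table, whatever the parameter is called.

```python
class PretrainedModel:
    """Hypothetical stand-in for a classifier loaded from a pickle file."""

    def __call__(self, rows):
        # Always predict class "a", one prediction per test row.
        return ["a" for _ in rows]


def evaluate_on_test_data(train, test, learners):
    """Simplified stand-in for an evaluator that always trains first.

    Each learner is called with the training data and must return a model;
    the model is then called with the test data to obtain predictions.
    """
    predictions = []
    for learner in learners:
        model = learner(train)  # the "training" step
        predictions.append(model(test))
    return predictions


pretrained = PretrainedModel()

# The thunk ignores the training data and hands back the existing model,
# so the evaluator's training step does no actual fitting.
thunk = lambda train_data: pretrained

train_rows = [1, 2, 3]
test_rows = [4, 5]
predicted = evaluate_on_test_data(train_rows, test_rows, [thunk])
print(predicted)
```

Because the evaluator never inspects how the model was produced, any callable that returns a model can stand in for a learner, which is what makes this trick possible.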
