Reusing a model fitted by cross_val_score in sklearn using joblib


Question

I created the following function in Python:

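import os
import re
from shutil import move

import joblib
from sklearn.model_selection import cross_val_score

# Assumed to exist at module level (not shown in the original post):
# collects the names of the pickle files written so far.
filenameL = []

def cross_validate(algorithms, data, labels, cv=4, n_jobs=-1):
    print("Cross validation using: ")
    for alg, predictors in algorithms:
        print(alg)
        print()
        # Compute the accuracy score for all the cross validation folds.
        scores = cross_val_score(alg, data, labels, cv=cv, n_jobs=n_jobs)
        # Take the mean of the scores (because we have one for each fold)
        print(scores)
        print("Cross validation mean score = " + str(scores.mean()))

        # Build a file name from the mean score and the estimator's class name.
        name = re.split(r'\(', str(alg))
        filename = str('%0.5f' % scores.mean()) + "_" + name[0] + ".pkl"
        # We might use this another time
        joblib.dump(alg, filename, compress=1, cache_size=1e9)
        filenameL.append(filename)
        # Move the file into the "pkl" directory; delete it if the move fails.
        try:
            move(filename, "pkl")
        except Exception:
            os.remove(filename)

        print()
    return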

I thought that, in order to do cross-validation, sklearn had to fit the model.

However, when I try to use it later (f is the pkl file I saved above with joblib.dump(alg, filename, compress=1, cache_size=1e9)):

alg = joblib.load(f)  
predictions = alg.predict_proba(train_data[predictors]).astype(float)

I get no error on the first line (so the load appears to work), but the following line raises NotFittedError: Estimator not fitted, call fit before exploiting the model.

What am I doing wrong? Can't I reuse the model fitted to calculate the cross-validation? I looked at "Keep the fitted parameters when using a cross_val_score in scikits learn", but either I don't understand the answer or it is not what I am looking for. What I want is to save the whole model with joblib so that I can then use it later without re-fitting.

Recommended answer

The real reason your model is not fitted is that cross_val_score first copies (clones) your model and then fits only the copy (source link).
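
A minimal sketch of that copying behavior, assuming a recent scikit-learn where cross_val_score is imported from sklearn.model_selection. The iris dataset and LogisticRegression are stand-ins chosen for illustration; they are not part of the original question.

```python
from sklearn.datasets import load_iris
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Illustrative data and estimator (assumptions, not from the question).
X, y = load_iris(return_X_y=True)
alg = LogisticRegression(max_iter=1000)

# cross_val_score clones `alg` for each fold and fits only the clones.
scores = cross_val_score(alg, X, y, cv=4)

# The object we passed in was never fitted, so predicting with it fails.
try:
    alg.predict_proba(X)
    still_unfitted = False
except NotFittedError:
    still_unfitted = True

print(len(scores), still_unfitted)
```

Pickling `alg` at this point would store an unfitted estimator, which matches the error seen after joblib.load.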

So your original model has not been fitted.
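
One possible way to get a reusable model, sketched under assumptions: use cross_val_score only to estimate performance, then call fit on the original estimator yourself before dumping it. The dataset, the estimator, and the temporary file path below are hypothetical choices for the demo.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Illustrative data and estimator (assumptions, not from the question).
X, y = load_iris(return_X_y=True)
alg = LogisticRegression(max_iter=1000)

# Cross-validation is used only to score the model.
scores = cross_val_score(alg, X, y, cv=4)

# Fit the original estimator explicitly before saving it.
alg.fit(X, y)

# Hypothetical file name, written to a temporary directory for this demo.
filename = "%0.5f_LogisticRegression.pkl" % scores.mean()
path = os.path.join(tempfile.mkdtemp(), filename)
joblib.dump(alg, path, compress=1)

# Later: load the fitted model and predict without re-fitting.
loaded = joblib.load(path)
probabilities = loaded.predict_proba(X)
print(probabilities.shape)
```

Note that the saved model is trained on all of the data, so it is not identical to any of the per-fold models that produced the scores. If the per-fold models themselves are wanted, sklearn.model_selection.cross_validate accepts return_estimator=True and returns the fitted clones.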
