Using cross_val_predict against test data set


Question

I'm confused about how to use cross_val_predict with a test data set.

I created a simple Random Forest model and used cross_val_predict to make predictions:

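import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it.
from sklearn.model_selection import cross_val_predict, KFold

lr = RandomForestClassifier(random_state=1, class_weight="balanced", n_estimators=25, max_depth=6)
# The old KFold(n, random_state=1) defaulted to 3 unshuffled folds; this is the equivalent call.
kf = KFold(n_splits=3)
# Out-of-fold predictions: each row is predicted by a model trained on the other folds.
predictions = cross_val_predict(lr, train_df[features_columns], train_df["target"], cv=kf)
predictions = pd.Series(predictions)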

I'm confused about the next step: how do I use what was learned above to make predictions on the test data set?

Recommended answer

As @DmitryPolonskiy commented, the model has to be trained (with the fit method) before it can be used to predict.

# Train the model (a.k.a. `fit` training data to it).
lr.fit(train_df[features_columns], train_df["target"])
# Use the model to make predictions based on testing data.
y_pred = lr.predict(test_df[features_columns])
# Compare the predicted y values to actual y values.
accuracy = (y_pred == test_df["target"]).mean()
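The snippet above depends on train_df and test_df from the question, which are not shown. As a rough, self-contained sketch of the same fit-then-predict flow, the example below builds stand-in data frames from synthetic data; the data set, the column names f0..f4, and the 75/25 split are assumptions made only for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data (assumption: 5 numeric features, binary target).
X, y = make_classification(n_samples=300, n_features=5, random_state=1)
features_columns = [f"f{i}" for i in range(5)]  # hypothetical column names
df = pd.DataFrame(X, columns=features_columns)
df["target"] = y

# Hold out 25% of the rows as the test set (assumed split ratio).
train_df, test_df = train_test_split(df, test_size=0.25, random_state=1)

# Same estimator settings as in the question.
lr = RandomForestClassifier(random_state=1, class_weight="balanced", n_estimators=25, max_depth=6)

# Fit on the training rows only, then predict the held-out rows.
lr.fit(train_df[features_columns], train_df["target"])
y_pred = lr.predict(test_df[features_columns])

# Fraction of test rows whose predicted label matches the true label.
accuracy = (y_pred == test_df["target"]).mean()
print(f"test accuracy: {accuracy:.3f}")
```

The test rows never reach fit, so the printed accuracy reflects how the model does on data it has not seen.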

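cross_val_predict is a cross-validation helper: for every training sample it returns a prediction made by a model that was fitted without that sample's fold, which lets you estimate the accuracy of your model. It works on clones of the estimator, so lr itself is left unfitted afterwards. Take a look at sklearn's cross-validation page.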
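To make the role of cross_val_predict concrete, here is a minimal sketch, assuming synthetic data and a 3-fold split, that turns its out-of-fold predictions into an accuracy estimate and then checks that the estimator passed in is still unfitted. The names model, oof_pred, cv_accuracy and still_unfitted are illustrative and do not come from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic data standing in for the training set (assumption).
X, y = make_classification(n_samples=300, n_features=5, random_state=1)
model = RandomForestClassifier(random_state=1, n_estimators=25, max_depth=6)

# One out-of-fold prediction per training sample.
oof_pred = cross_val_predict(model, X, y, cv=KFold(n_splits=3))

# Comparing them with the true labels gives a cross-validated accuracy estimate.
cv_accuracy = accuracy_score(y, oof_pred)
print(f"cross-validated accuracy: {cv_accuracy:.3f}")

# cross_val_predict fits clones, so the original estimator cannot predict yet.
try:
    model.predict(X)
    still_unfitted = False
except NotFittedError:
    still_unfitted = True
print("model still unfitted:", still_unfitted)
```

This is why a separate call to fit on the full training set is needed before predicting on the test set.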
