RandomForestClassifier has no attribute transform, so how to get predictions?


Problem description

How do you get predictions out of a RandomForestClassifier? Loosely following the latest docs here, my code looks like...

# Split the data into training and test sets (25% held out for testing)
SPLIT_SEED = 64  # some const seed just for reproducibility
TRAIN_RATIO = 0.75
(trainingData, testData) = df.randomSplit([TRAIN_RATIO, 1-TRAIN_RATIO], seed=SPLIT_SEED)
print(f"Training set ({trainingData.count()}):")
trainingData.show(n=3)
print(f"Test set ({testData.count()}):")
testData.show(n=3)

# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)

rf.fit(trainingData)
#print(rf.featureImportances)

preds = rf.transform(testData)

When running this, I get the error

AttributeError: 'RandomForestClassifier' object has no attribute 'transform'

Examining the Python API docs, I see nothing that looks like it relates to generating predictions from the trained model (nor feature importances, for that matter). I don't have much experience with mllib, so I'm not sure what to make of this. Does anyone with more experience know what to do here?

Solution

Look closely at the example in the documentation:

>>> model = rf.fit(td)
>>> model.featureImportances
SparseVector(1, {0: 1.0})
>>> allclose(model.treeWeights, [1.0, 1.0, 1.0])
True
>>> test0 = spark.createDataFrame([(Vectors.dense(-1.0),)], ["features"])
>>> result = model.transform(test0).head()
>>> result.prediction

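You will notice that rf.fit returns a fitted model (a RandomForestClassificationModel), which is a different class from the original RandomForestClassifier. Calling fit does not change rf itself, so the return value has to be kept.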

It is the fitted model, not the classifier, that has the transform method and the featureImportances attribute.
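To make the estimator/model split concrete, here is a toy sketch in plain Python. It is not Spark's implementation: the class names ToyMajorityClassifier and ToyMajorityModel and the dictionary rows are made up for illustration. It only mimics the shape of the API, where fit returns a new object and only that object can transform.

```python
class ToyMajorityClassifier:
    """Hypothetical estimator: holds configuration and knows how to fit."""

    def __init__(self, label_key="label"):
        self.label_key = label_key

    def fit(self, rows):
        # Count how often each label occurs in the training rows.
        counts = {}
        for row in rows:
            label = row[self.label_key]
            counts[label] = counts.get(label, 0) + 1
        majority = max(counts, key=counts.get)
        # The estimator is left unchanged; a separate model object is returned.
        return ToyMajorityModel(majority)


class ToyMajorityModel:
    """Hypothetical fitted model: the only object that can make predictions."""

    def __init__(self, majority_label):
        self.majority_label = majority_label

    def transform(self, rows):
        # Return copies of the rows with a "prediction" field added.
        return [{**row, "prediction": self.majority_label} for row in rows]


train = [{"label": 1}, {"label": 1}, {"label": 0}]
test = [{"x": 5}, {"x": 7}]

estimator = ToyMajorityClassifier()
model = estimator.fit(train)      # keep the return value
preds = model.transform(test)     # works: the model has transform

print(preds)
print(hasattr(estimator, "transform"))  # the estimator itself has no transform
```

Discarding the result of fit, as in the question, throws away the only object that can predict.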

So in your code:

# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)

model = rf.fit(trainingData)
#print(model.featureImportances)

preds = model.transform(testData)
