Got "continuous is not supported" error in RandomForestRegressor

Question

I'm just trying to run a simple RandomForestRegressor example, but when I test the accuracy I get this error:

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
    177
    178     # Compute accuracy for each possible representation
--> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    180     if y_type.startswith('multilabel'):
    181         differing_labels = count_nonzero(y_true - y_pred, axis=1)

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in _check_targets(y_true, y_pred)
     90     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                        "multilabel-sequences"]):
---> 92         raise ValueError("{0} is not supported".format(y_type))
     93
     94     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

This is a sample of the data. I can't show the real data.

target, func_1, func_2, func_3, ... func_200
float, float, float, float, ... float

This is my code.

import pandas as pd
import numpy as np
from sklearn.preprocessing import Imputer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import tree

train = pd.read_csv('data.txt', sep='\t')

labels = train.target
train.drop('target', axis=1, inplace=True)
cat = ['cat']
train_cat = pd.get_dummies(train[cat])

train.drop(train[cat], axis=1, inplace=True)
train = np.hstack((train, train_cat))

imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train)
train = imp.transform(train)

x_train, x_test, y_train, y_test = train_test_split(train, labels.values, test_size = 0.2)

clf = RandomForestRegressor(n_estimators=10)

clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy_score(y_test, y_pred) # This is where I get the error.

Recommended answer

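It's because accuracy_score is for classification tasks only. A RandomForestRegressor predicts continuous values, and accuracy_score rejects targets of type "continuous", which is exactly what the ValueError reports. For regression you should use something different, for example: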

clf.score(X_test, y_test)

Here X_test is the test samples and y_test is the corresponding ground-truth values. score computes the predictions internally and, for a regressor, returns the coefficient of determination (R^2) rather than a classification accuracy.
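As a minimal sketch of the point above, the snippet below fits a RandomForestRegressor and evaluates it with regression metrics instead of accuracy_score. The data is a synthetic stand-in generated with make_regression (an assumption, since the real data is not shown), and the imports use sklearn.model_selection, the current home of train_test_split, rather than the older sklearn.cross_validation module from the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (accuracy_score, mean_absolute_error,
                             mean_squared_error, r2_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption: continuous float target).
X, y = make_regression(n_samples=200, n_features=20, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = RandomForestRegressor(n_estimators=10, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# accuracy_score is a classification metric and rejects continuous targets.
error_message = None
try:
    accuracy_score(y_test, y_pred)
except ValueError as exc:
    error_message = str(exc)
print("accuracy_score raised:", error_message)

# Regression metrics work on continuous targets.
r2 = r2_score(y_test, y_pred)             # coefficient of determination
mse = mean_squared_error(y_test, y_pred)  # mean squared error
mae = mean_absolute_error(y_test, y_pred) # mean absolute error

# For a regressor, score() predicts internally and returns R^2.
score = clf.score(X_test, y_test)

print("R^2:", r2)
print("MSE:", mse)
print("MAE:", mae)
print("clf.score:", score)
```

Which metric to report depends on the problem: R^2 is unitless and convenient for comparing models, while MSE and MAE are expressed in (squared) units of the target and are often easier to interpret.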
