A column-vector y was passed when a 1d array was expected

Problem description

I need to fit a RandomForestRegressor from sklearn.ensemble.

forest = ensemble.RandomForestRegressor(**RF_tuned_parameters)
model = forest.fit(train_fold, train_y)
yhat = model.predict(test_fold)

This code worked until I added some preprocessing of the data (train_y). Now I get this warning:

DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().

model = forest.fit(train_fold, train_y)

Previously train_y was a Series; now it is a NumPy array (a column vector). If I apply train_y.ravel(), it becomes a row vector and the warning disappears, though the prediction step then takes a very long time (in fact, it never finishes...).

In the docs for RandomForestRegressor I found that train_y should be defined as y : array-like, shape = [n_samples] or [n_samples, n_outputs]. Any idea how to solve this issue?

Recommended answer

Change this line:

model = forest.fit(train_fold, train_y)

to:

model = forest.fit(train_fold, train_y.values.ravel())

.values returns the values as a NumPy array, here with shape (n, 1).

.ravel() flattens that array to shape (n,).
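To make the shape change concrete, here is a minimal sketch. It assumes train_y is a single-column pandas DataFrame; the column name "target" and the sample values are made up for illustration. It also covers the case described in the question, where train_y is already a NumPy array and therefore has no .values attribute.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train_y: a single-column DataFrame.
train_y = pd.DataFrame({"target": [1.0, 2.0, 3.0, 4.0]})

column = train_y.values   # 2-D column vector, one row per sample
flat = column.ravel()     # flattened to a 1-D array

print(column.shape)  # (4, 1)
print(flat.shape)    # (4,)

# If train_y is already a NumPy array it has no .values attribute,
# so call ravel() on it directly.
train_y_array = np.array([[1.0], [2.0], [3.0], [4.0]])
flat_from_array = train_y_array.ravel()

print(flat_from_array.shape)  # (4,)
```

Either flattened array can then be passed as the second argument to forest.fit in place of train_y.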
