ValueError: x and y must be the same size


Problem description

import numpy as np
import pandas as pd
import matplotlib.pyplot as pt

data1 = pd.read_csv('stage1_labels.csv')

X = data1.iloc[:, :-1].values
y = data1.iloc[:, 1].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
label_X = LabelEncoder()
X[:,0] = label_X.fit_transform(X[:,0])
encoder = OneHotEncoder(categorical_features = [0])
X = encoder.fit_transform(X).toarray()

from sklearn.cross_validation import train_test_split
X_train, X_test, y_train,y_test = train_test_split(X, y, test_size = 0.4, random_state = 0)

#fitting Simple Regression to training set

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

#predicting the test set results
y_pred = regressor.predict(X_test)

#Visualization of the training set results
pt.scatter(X_train, y_train, color = 'red')
pt.plot(X_train, regressor.predict(X_train), color = 'green')
pt.title('salary vs yearExp (Training set)')
pt.xlabel('years of experience')
pt.ylabel('salary')
pt.show()

I need help understanding the error raised when I execute the code above. The error is:

"raise ValueError("x and y must be the same size")"

I have a .csv file with 1398 rows and 2 columns. I set aside 40% of it as the test set, as shown in the code above.

Recommended answer

Print the shape of X_train. What do you see? I'd bet X_train is 2D (a matrix with a single column), while y_train is 1D (a vector). As a result, the two have different sizes.
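The shape check suggested above can be sketched as follows. The arrays are hypothetical stand-ins, not the asker's CSV: the sketch assumes that one-hot encoding left X_train with three columns, which is one way the element counts can end up different. As far as I know, scatter compares the total number of elements in x and y, so the `size` attribute is worth printing alongside `shape`.

```python
import numpy as np

# Hypothetical stand-in data (assumption): 6 samples, 3 one-hot columns.
X_train = np.eye(3)[[0, 1, 2, 0, 1, 2]]   # 2D array, shape (6, 3)
y_train = np.array([0, 1, 0, 1, 1, 0])    # 1D array, shape (6,)

# Compare dimensions and total element counts.
print(X_train.shape, y_train.shape)   # (6, 3) (6,)
print(X_train.size, y_train.size)     # 18 6

# A single column of X_train has the same number of elements as y_train.
print(X_train[:, 0].shape)            # (6,)
```

If the two `size` values differ, the plotting call is being handed more x values than y values.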

I think plotting with X_train[:,0] should solve the problem, since the plotting call is where the error originates.
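A minimal sketch of that fix, again on hypothetical stand-in data rather than the asker's file. It assumes X_train has several columns after encoding, and it selects the Agg backend only so the snippet runs without a display. The first scatter call is expected to fail with the error from the question, and the second one plots a single column against y_train.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pt
import numpy as np

# Hypothetical stand-in data (assumption): 6 samples, 3 one-hot columns.
X_train = np.eye(3)[[0, 1, 2, 0, 1, 2]]             # shape (6, 3)
y_train = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])  # shape (6,)

fig, ax = pt.subplots()

# Passing the whole 2D array gives scatter 18 x values for 6 y values.
try:
    ax.scatter(X_train, y_train, color="red")
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Plotting one column at a time keeps x and y the same length.
x_col = X_train[:, 0]
points = ax.scatter(x_col, y_train, color="red")
```

Which column to plot depends on the data. With one-hot encoded features, a single indicator column may not make for a meaningful x axis, so it is worth checking what X_train[:,0] actually contains before relying on the chart.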
