Python "Too many indices for array"

Problem description

I am reading a file in Python using pandas and then saving it in a NumPy array. The file has 11303402 rows x 10 columns. I need to split the data for cross-validation, so I sliced it into an array of examples with 11303402 rows x 9 columns and an array of labels with 11303402 rows x 1 column. The following is the code:

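import numpy as np
import pandas as pd

tdata = pd.read_csv('train.csv')
tdata.columns = ['Arrival_Time', 'Creation_Time', 'x', 'y', 'z', 'User', 'Model', 'Device', 'sensor', 'gt']

User_Data = np.array(tdata)
features = User_Data[:, 0:9]   # first 9 columns, shape (11303402, 9)
labels = User_Data[:, 9:10]    # last column, kept 2-D: shape (11303402, 1)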

The error comes from the following code:

classes=np.unique(labels)
idx=labels==classes[0]
Yt=labels[idx]
Xt=features[idx,:]

At the line:

Xt=features[idx,:]

it says "Too many indices for array".

The shapes of all 3 data sets are:

print(np.shape(tdata))     # (11303402, 10)
print(np.shape(features))  # (11303402, 9)
print(np.shape(labels))    # (11303402, 1)

If anyone knows the problem, please help.

Recommended answer

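The problem is that idx has shape (11303402, 1), because the logical comparison returns an array of the same shape as labels. A 2-D boolean mask already accounts for both dimensions of features, so the trailing ":" in features[idx,:] is treated as an index into a third dimension that does not exist. The quick workaround is to use the first column of idx, which is a 1-D mask: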

Xt=features[idx[:,0],:]
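
The behaviour described above can be reproduced on a small array. The following is only a sketch: the 6 x 10 array and its label values are made up for illustration, and the names labels_2d, labels_1d, Xt_answer and Xt_alt are hypothetical, not from the original question. It shows the failing index, the workaround from the answer, and one possible alternative, which is to take the label column with an integer index (9 instead of 9:10) so that labels and the mask are 1-D from the start.

```python
import numpy as np

# Small stand-in for the real data: 6 rows x 10 columns, last column is the label.
# (The values are made up for illustration.)
data = np.arange(60).reshape(6, 10)
data[:, 9] = [0, 1, 0, 2, 1, 0]

features = data[:, 0:9]     # shape (6, 9)
labels_2d = data[:, 9:10]   # a slice keeps the axis: shape (6, 1)
labels_1d = data[:, 9]      # an integer index drops the axis: shape (6,)

classes = np.unique(labels_2d)

# A 2-D boolean mask followed by ":" asks for three indices on a 2-D array.
idx_2d = labels_2d == classes[0]   # shape (6, 1)
try:
    features[idx_2d, :]
    failed = False
except IndexError:
    failed = True

# Workaround from the answer: use the first column of the mask.
Xt_answer = features[idx_2d[:, 0], :]

# Alternative (an assumption, not part of the answer): keep the labels 1-D.
idx_1d = labels_1d == classes[0]   # shape (6,)
Xt_alt = features[idx_1d, :]
Yt_alt = labels_1d[idx_1d]
```

Both approaches select the same rows. Keeping labels 1-D also makes Yt a 1-D array, which may or may not be what downstream code expects.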
