IndexError: too many indices for array, datetime strings and numpy genfromtxt


Problem description




I am having trouble reading data correctly from .txt files.

A sample of my data looks like so:

file1.txt

Time: ID: W: X: Y: Z:
2016/02/25:19:08:41 006124189X 769 372 363 348
2016/02/25:21:41:13 006124189X 769 362 308 390
2016/02/25:22:38:20 006124189X 769 362 363 390
2016/02/26:07:37:42 006124189X 769 372 272 366
2016/02/26:08:54:34 006124189X 769 372 272 366
2016/02/26:09:57:04 006124189X 769 372 363 371

The first column is a datetime string, the second is an ID consisting of digits and letters, and the remaining columns are integers in the range 0-10000.

I will eventually try to plot some of these integer values against the time value recorded, but currently I am just trying to get the data to be called correctly. My current code setup:

import numpy as np
import matplotlib.pyplot as plt
import pylab
import datetime

#File name for data input.
datafile = 'file1.txt'

#Names to be used for column headers.
names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

#Read Data from file into array. Skipping the first line. 
#Datatypes used, object for Time, String for ID and Integer for the rest.
data = np.genfromtxt(datafile, skip_header=1, dtype="Object,S11,i8,i8,i8,i8", names = ['Time', 'ID', 'W', 'X', 'Y', 'Z'])

#Print the data called to check it works.
print data

#Designating each column to a name.
Time = data[:,0]
ID = data[:,1]
W = data[:,2]
X = data[:,3]
Y = data[:,4]
Z = data[:,5]

#Print designated column.
print Time

I've tried to be as clear as possible about what I'm trying to do.

Eventually I want to include a plot using matplotlib adding something like so to the end:

plt.plot(Time,W, label='W vs Time')
plt.xlabel('Time',fontsize=12)
plt.ylabel('W',fontsize=12) 
plt.show()

However, when the script is run in its current form it gives the error:

line 15, in <module>
Time = data[:,0]
IndexError: too many indices for array

The error is the same for each of the other columns, e.g.

line 16, in <module>
W = data[:,2]
IndexError: too many indices for array

The print data line before that correctly outputs all the data in the file, showing each time as a quoted string such as '2016/02/25:19:08:32'.

I am unsure how to handle this form of data correctly. If I just set dtype=i8, I can access any of the integer columns fine, but the Time and ID columns come back as -1 for every row, understandably.

I have tried following this scipy doc, also tried this stack page of a similar thing which I couldn't get to work.

Any help is appreciated.

Solution

data is a structured array. Check its shape and dtype. It has named fields instead of columns.

ID = data['ID']

Should work instead of data[:,1].

Or

Time = data[names[0]]
ID = data[names[1]]
...
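As a rough illustration of the point above, here is a minimal sketch that reads a few of the sample rows and inspects the result. It makes some assumptions that are not in the question: the data is fed from an in-memory io.StringIO instead of file1.txt, the code is written for Python 3, and the Time and ID columns use fixed-width text dtypes (U19, U11) in place of the original Object and S11.

```python
import io
import numpy as np

# Hypothetical in-memory copy of part of file1.txt, so the sketch runs on its own.
sample = io.StringIO(
    "Time: ID: W: X: Y: Z:\n"
    "2016/02/25:19:08:41 006124189X 769 372 363 348\n"
    "2016/02/25:21:41:13 006124189X 769 362 308 390\n"
    "2016/02/25:22:38:20 006124189X 769 362 363 390\n"
)

names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

# Assumption: text dtypes U19/U11 for Time and ID instead of Object/S11.
data = np.genfromtxt(sample, skip_header=1,
                     dtype="U19,U11,i8,i8,i8,i8", names=names)

print(data.shape)        # one-dimensional: one record per row
print(data.dtype.names)  # the field names

# A structured array has a single axis, so two indices are one too many.
try:
    data[:, 0]
except IndexError as exc:
    print("IndexError:", exc)

# Fields are selected by name instead.
W = data['W']
ID = data['ID']
print(W)
```

Each record is still reachable by position on the single axis, so data[0] gives the first row and data[0]['W'] gives one value from it.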

Something is wrong with the genfromtxt documentation. It needs to stress that if using names the result will be a structured array with a compound dtype, and that users need to access the data accordingly.
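For the plotting goal mentioned in the question, the Time field would probably still need converting from strings to datetime objects. The following is only a sketch under assumptions: the two strings stand in for data['Time'], and the format string is inferred from the sample rows, not stated by the original poster.

```python
from datetime import datetime
import numpy as np

# Hypothetical stand-in for data['Time'], using values from the sample rows.
time_strings = np.array(['2016/02/25:19:08:41', '2016/02/26:07:37:42'])

# Assumed layout: year/month/day:hour:minute:second.
TIME_FORMAT = '%Y/%m/%d:%H:%M:%S'

# Parse each string into a datetime object.
times = [datetime.strptime(str(s), TIME_FORMAT) for s in time_strings]

print(times[0])
```

A list like times can then be passed as the x values to plt.plot in place of the raw strings.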
