IndexError: too many indices for array, datetime strings and numpy genfromtxt
I am having trouble reading data correctly from .txt files.
A sample of my data looks like so:
file1.txt
Time: ID: W: X: Y: Z:
2016/02/25:19:08:41 006124189X 769 372 363 348
2016/02/25:21:41:13 006124189X 769 362 308 390
2016/02/25:22:38:20 006124189X 769 362 363 390
2016/02/26:07:37:42 006124189X 769 372 272 366
2016/02/26:08:54:34 006124189X 769 372 272 366
2016/02/26:09:57:04 006124189X 769 372 363 371
The first column is a datetime string, the second is an ID consisting of digits and letters, and the rest are integers in the range 0-10000.
I will eventually try to plot some of these integer values against the recorded time, but for now I am just trying to get the data read in correctly. My current code setup:
import numpy as np
import matplotlib.pyplot as plt
import pylab
import datetime
#File name for data input.
datafile = 'file1.txt'
#Names to be used for column headers.
names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']
#Read Data from file into array. Skipping the first line.
#Datatypes used, object for Time, String for ID and Integer for the rest.
data = np.genfromtxt(datafile, skip_header=1, dtype="Object,S11,i8,i8,i8,i8", names = ['Time', 'ID', 'W', 'X', 'Y', 'Z'])
#Print the data called to check it works.
print data
#Designating each column to a name.
Time = data[:,0]
ID = data[:,1]
W = data[:,2]
X = data[:,3]
Y = data[:,4]
Z = data[:,5]
#Print designated column.
print Time
I've tried to be as clear as possible about what I'm trying to do.
Eventually I want to include a plot using matplotlib, adding something like this to the end:
plt.plot(Time,W, label='W vs Time')
plt.xlabel('Time',fontsize=12)
plt.ylabel('W',fontsize=12)
plt.show()
However, when the script is run in its current form it gives the error:
line 15, in <module>
Time = data[:,0]
IndexError: too many indices for array
The error is the same for each of the other columns, e.g.
line 16, in <module>
W = data[:,2]
IndexError: too many indices for array
The print data line before that correctly outputs all the data in the file, showing each time as a string such as '2016/02/25:19:08:32', including the quotes.
I am unsure how to handle data in this form correctly. If I just set dtype=i8, then I can access any of the data columns fine except the Time and ID columns, which return -1 for all rows, understandably.
I have tried following this scipy doc and this Stack Overflow page on a similar problem, but I couldn't get either to work.
Any help is appreciated.
data is a structured array. Check its shape and dtype. It is one-dimensional, with named fields instead of columns, which is why a two-index expression such as data[:,0] raises the IndexError.
ID = data['ID']
should work instead of data[:,1].
Or
Time = data[names[0]]
ID = data[names[1]]
...
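A minimal sketch of the field access described above, using a few rows from the question fed in through io.StringIO rather than a real file. The unicode field types (U19, U10) and the encoding argument are assumptions made for this illustration and are not what the question used; the point is only that the result has one axis and is indexed by field name.

```python
import io
import numpy as np

# A few rows in the same layout as file1.txt (header line plus data).
sample = io.StringIO(
    "Time: ID: W: X: Y: Z:\n"
    "2016/02/25:19:08:41 006124189X 769 372 363 348\n"
    "2016/02/25:21:41:13 006124189X 769 362 308 390\n"
    "2016/02/25:22:38:20 006124189X 769 362 363 390\n"
)

names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

# Assumption: unicode string fields (U19, U10) instead of the question's
# Object/S11, so the text columns come back as ordinary strings.
data = np.genfromtxt(sample, skip_header=1, encoding="utf-8",
                     dtype="U19,U10,i8,i8,i8,i8", names=names)

print(data.shape)        # a single axis: one record per row
print(data.dtype.names)  # the field names

# Fields are selected by name, not by column number.
Time = data['Time']
W = data['W']

# Indexing with two indices fails because there is no column axis.
try:
    data[:, 0]
    two_index_failed = False
except IndexError:
    two_index_failed = True
print(two_index_failed)
```

Each field comes back as an ordinary one-dimensional array, so W can be used directly wherever a column of integers is expected.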
Something is wrong with the genfromtxt documentation. It needs to stress that if using names the result will be a structured array with a compound dtype, and that users need to access the data accordingly.
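Since the question's eventual goal is to plot W against Time, here is a hedged sketch of one way to turn the time strings into datetime objects once the Time field has been extracted. The format string is inferred from the sample rows, and the names time_strings and TIME_FORMAT are made up for this example; matplotlib can generally plot datetime objects on the x-axis, but that step is not shown here.

```python
from datetime import datetime

# Time strings in the layout used by file1.txt, e.g. the contents of data['Time'].
time_strings = [
    '2016/02/25:19:08:41',
    '2016/02/25:21:41:13',
    '2016/02/26:07:37:42',
]

# Assumption: year/month/day, then hour:minute:second, joined by a colon.
TIME_FORMAT = '%Y/%m/%d:%H:%M:%S'

# Parse each string into a datetime object.
times = [datetime.strptime(s, TIME_FORMAT) for s in time_strings]

print(times[0])
```

The resulting list is in the same order as the rows, so it can be paired element by element with data['W'].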