numpy.genfromtxt imports tuples instead of arrays
I am trying to learn Python and NumPy, so please bear with me. I am using numpy.genfromtxt to import a CSV file into a matrix. The CSV looks as follows:
Time(min),Nm,Speed,Power,Distance,Rpm,Bpm,interval,Altitude,Rate,Incline,Temp,PowerBalance,LeftTorqueEffectiveness,RightTorqueEffectiveness,getLeftPedalSmoothness,getRightPedalSmoothness,getCombinedPedalSmoothness,THb,SmO2,km
0.016666668,,4.3555064,0,0.002,0,118,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72
0.033333335,,4.3555064,20,0.002,0,119,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72
0.05,,4.444291,13,0.004,0,119,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72
Now I run:
matrixCsv = np.genfromtxt(open(csvFile, "rb"), delimiter=',',
                          missing_values=0, skip_header=1, dtype=float,
                          usecols=(0,2,3,4,5,6,7,8,9,10,11,17), names=True)
and I get:
[ (0.033333335, 4.3555064, 20.0, 0.002, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
(0.05, 4.444291, 13.0, 0.004, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
(0.06666667, 4.4781966, 16.0, 0.006, 0.0, 120.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
...,
which to me looks like tuples encapsulated into an array. But why tuples? I understand that numpy arrays/matrices need to be homogeneous, and that numpy makes a tuple out of inhomogeneous data. But why is my data inhomogeneous? I do not understand...
You are confusing how skip_header and names interact. The right way to read the data and use the first row as the field names is:
In [185]:
np.genfromtxt('temp.csv', delimiter=',',
              missing_values=0, skip_header=0, dtype=float,
              usecols=(0,2,3,4,5,6,7,8,9,10,11,17), names=True)
Out[185]:
array([ (0.016666668, 4.3555064, 0.0, 0.002, 0.0, 118.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0),
(0.033333335, 4.3555064, 20.0, 0.002, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0),
(0.05, 4.444291, 13.0, 0.004, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)],
dtype=[('Timemin', '<f8'), ('Speed', '<f8'), ('Power', '<f8'), ('Distance', '<f8'), ('Rpm', '<f8'), ('Bpm', '<f8'), ('interval', '<f8'), ('Altitude', '<f8'), ('Rate', '<f8'), ('Incline', '<f8'), ('Temp', '<f8'), ('getCombinedPedalSmoothness', '<f8')])
It is not an array of tuples but a structured array. With names=True, skip_header=1 skips the real header line, so the first row of data is used as the names, which is probably not what you want (see how you are missing the first line of data?).
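To make the interaction concrete, here is a minimal sketch using a small made-up CSV held in memory (the column names `t`, `speed`, `power` and the values are assumptions for illustration, not the file from the question). It reads the same text twice, once with the header kept and once with skip_header=1, and compares the number of records:

```python
import io
import numpy as np

# Hypothetical three-column CSV, invented for this illustration.
csv_text = "t,speed,power\n1.5,4.0,10\n2.5,5.0,20\n3.5,6.0,30\n"

# names=True takes the field names from the first line that is not skipped.
with_header = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# skip_header=1 throws away the real header, so the first data row
# is consumed as the names line and is missing from the result.
lost_row = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                         skip_header=1, names=True)

print(with_header.dtype.names)  # field names taken from the header line
print(with_header.shape)        # one record per data row
print(lost_row.shape)           # one record fewer
print(with_header["speed"])     # a whole column, selected by field name
```

Each record of a structured array prints like a tuple, but every field is still a float, and a column can be pulled out by name as in the last line.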
You can also get rid of the names and read the data into an ordinary numpy array.
In [186]:
np.genfromtxt('temp.csv', delimiter=',',
              missing_values=0, skip_header=1, dtype=float,
              usecols=(0,2,3,4,5,6,7,8,9,10,11,17))
Out[186]:
array([[ 1.66666680e-02, 4.35550640e+00, 0.00000000e+00,
2.00000000e-03, 0.00000000e+00, 1.18000000e+02,
1.00000000e+00, 6.84300000e+02, 0.00000000e+00,
0.00000000e+00, 1.47100000e+01, -1.00000000e+00],
[ 3.33333350e-02, 4.35550640e+00, 2.00000000e+01,
2.00000000e-03, 0.00000000e+00, 1.19000000e+02,
1.00000000e+00, 6.84300000e+02, 0.00000000e+00,
0.00000000e+00, 1.47100000e+01, -1.00000000e+00],
[ 5.00000000e-02, 4.44429100e+00, 1.30000000e+01,
4.00000000e-03, 0.00000000e+00, 1.19000000e+02,
1.00000000e+00, 6.84300000e+02, 0.00000000e+00,
0.00000000e+00, 1.47100000e+01, -1.00000000e+00]])
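If you want to keep the field names and still get a plain 2-D array for numeric work, one option is to convert the structured array afterwards. The sketch below is not part of the original answer; it assumes a NumPy version that provides numpy.lib.recfunctions.structured_to_unstructured (1.16 or later, as far as I know) and again uses a made-up in-memory CSV:

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical CSV, invented for this illustration.
csv_text = "t,speed,power\n1.5,4.0,10\n2.5,5.0,20\n3.5,6.0,30\n"

# Structured array: one record per row, fields named after the header.
records = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# Plain 2-D float array with the same values, one column per field.
plain = rfn.structured_to_unstructured(records)

print(records.shape)  # 1-D: a sequence of records
print(plain.shape)    # 2-D: rows x columns
print(plain[1, 1])    # second row, second column
```

This only makes sense when all fields share a compatible type, as they do here with dtype=float.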