numpy.genfromtxt imports tuples instead of arrays

Problem description

I am trying to learn Python and NumPy, so please bear with me. I am using numpy.genfromtxt to import a CSV file into a matrix. The CSV looks as follows:

Time(min),Nm,Speed,Power,Distance,Rpm,Bpm,interval,Altitude,Rate,Incline,Temp,PowerBalance,LeftTorqueEffectiveness,RightTorqueEffectiveness,getLeftPedalSmoothness,getRightPedalSmoothness,getCombinedPedalSmoothness,THb,SmO2,km
0.016666668,,4.3555064,0,0.002,0,118,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72
0.033333335,,4.3555064,20,0.002,0,119,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72
0.05,,4.444291,13,0.004,0,119,1,684.3,0.0,0.0,14.71,50,-1.0,-1.0,-1.0,-1.0,-1.0,311.72,311.72

Now I run:

matrixCsv = np.genfromtxt(open(csvFile, "rb"), delimiter=',', \
                          missing_values=0,skip_header=1,dtype=float,\
                          usecols=(0,2,3,4,5,6,7,8,9,10,11,17),names=True)

and I get:

[ (0.033333335, 4.3555064, 20.0, 0.002, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
(0.05, 4.444291, 13.0, 0.004, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
(0.06666667, 4.4781966, 16.0, 0.006, 0.0, 120.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)
...,

which to me looks like tuples encapsulated into an array. But why tuples? I understand that numpy arrays/matrices need to be homogeneous, and that numpy makes a tuple out of inhomogeneous data. But why is my data inhomogeneous? I do not understand...

Solution

You are confusing how skip_header and names work together. The right way to read the data and use the first row as variable names is:

In [185]:

np.genfromtxt('temp.csv', delimiter=',', \
                          missing_values=0,skip_header=0,dtype=float,\
                          usecols=(0,2,3,4,5,6,7,8,9,10,11,17),names=True)
Out[185]:
array([ (0.016666668, 4.3555064, 0.0, 0.002, 0.0, 118.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0),
       (0.033333335, 4.3555064, 20.0, 0.002, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0),
       (0.05, 4.444291, 13.0, 0.004, 0.0, 119.0, 1.0, 684.3, 0.0, 0.0, 14.71, -1.0)], 
      dtype=[('Timemin', '<f8'), ('Speed', '<f8'), ('Power', '<f8'), ('Distance', '<f8'), ('Rpm', '<f8'), ('Bpm', '<f8'), ('interval', '<f8'), ('Altitude', '<f8'), ('Rate', '<f8'), ('Incline', '<f8'), ('Temp', '<f8'), ('getCombinedPedalSmoothness', '<f8')])

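It is not an array of tuples but a structured array: each row is a record whose fields are named after the header. With names=True, genfromtxt reads the names from the first line that remains after skip_header lines are dropped, so skip_header=1 discards the real header and uses the first row of data as names, which is probably not what you want (see how you are missing the first line of data?).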
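To see the effect in isolation, here is a minimal sketch that uses a small in-memory CSV rather than the real file. The three-column data and the variable names (csv_text, good, bad) are made up for illustration; only the genfromtxt arguments mirror the ones discussed above.

```python
import io
import numpy as np

# Hypothetical three-column CSV standing in for the real file.
csv_text = (
    "Time(min),Speed,Power\n"
    "0.1,4.3,0\n"
    "0.2,4.4,20\n"
    "0.3,4.5,13\n"
)

# names=True alone: the header line becomes the field names.
good = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# names=True with skip_header=1: the real header is skipped, so the
# first data row is consumed as field names and one row goes missing.
bad = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                    names=True, skip_header=1)

print(good.dtype.names)      # characters such as "(" and ")" are stripped from names
print(good["Speed"])         # a whole column, selected by field name
print(len(good), len(bad))   # bad has one record fewer than good
```

Each element of a structured array prints like a tuple, which is why the output looks like "tuples inside an array" even though every field here is a float.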

You can also drop the names and read the data into an ordinary numpy array.

In [186]:

np.genfromtxt('temp.csv', delimiter=',', \
                          missing_values=0,skip_header=1,dtype=float,\
                          usecols=(0,2,3,4,5,6,7,8,9,10,11,17))
Out[186]:
array([[  1.66666680e-02,   4.35550640e+00,   0.00000000e+00,
          2.00000000e-03,   0.00000000e+00,   1.18000000e+02,
          1.00000000e+00,   6.84300000e+02,   0.00000000e+00,
          0.00000000e+00,   1.47100000e+01,  -1.00000000e+00],
       [  3.33333350e-02,   4.35550640e+00,   2.00000000e+01,
          2.00000000e-03,   0.00000000e+00,   1.19000000e+02,
          1.00000000e+00,   6.84300000e+02,   0.00000000e+00,
          0.00000000e+00,   1.47100000e+01,  -1.00000000e+00],
       [  5.00000000e-02,   4.44429100e+00,   1.30000000e+01,
          4.00000000e-03,   0.00000000e+00,   1.19000000e+02,
          1.00000000e+00,   6.84300000e+02,   0.00000000e+00,
          0.00000000e+00,   1.47100000e+01,  -1.00000000e+00]])
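If the data has already been read with names=True, it may not be necessary to read the file again. The sketch below assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available, and again uses a made-up three-column CSV in place of the real file.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical three-column CSV standing in for the real file.
csv_text = (
    "Time(min),Speed,Power\n"
    "0.1,4.3,0\n"
    "0.2,4.4,20\n"
    "0.3,4.5,13\n"
)

# Plain 2-D float array: skip the header line and do not ask for names.
plain = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Structured array with named fields, then converted to a plain 2-D array.
named = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)
as_2d = rfn.structured_to_unstructured(named)

print(plain.shape)                    # rows x columns
print(np.array_equal(plain, as_2d))   # both routes give the same values
```

The conversion only makes sense when all fields share a compatible type, as they do here with dtype=float.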
