Reading a comma-delimited file with a date object and a float with Python
Problem description
I have a file with entries that look like this:
2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B
Since the delimiter is a comma (','), this is a classic CSV file.
I tried:
wind = loadtxt("/disk2/Wind/ws425.log.test", dtype(str,float), delimiter=',', usecols=(0,4))
ts= time.strptime(str(wind[:,0]), '%Y-%m-%d %H:%M:%S.%f')
and what I get is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-31-484b71dea724> in <module>()
----> 1 ts= time.strptime(str(wind[:,0]), '%Y-%m-%d %H:%M:%S.%f')
/opt/Enthought/canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/_strptime.pyc in _strptime_time(data_string, format)
452
453 def _strptime_time(data_string, format="%a %b %d %H:%M:%S %Y"):
--> 454 return _strptime(data_string, format)[0]
/opt/Enthought/canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/_strptime.pyc in _strptime(data_string, format)
323 if not found:
324 raise ValueError("time data %r does not match format %r" %
--> 325 (data_string, format))
326 if len(data_string) != found.end():
327 raise ValueError("unconverted data remains: %s" %
ValueError: time data "['2013-12-12 00:00:02.251311' '2013-12-12 00:00:03.255296'\n '2013-12-12 00:00:04.254294' ..., '2013-12-12 16:10:50.579022'\n '2013-12-12 16:10:51.607035' '2013-12-12 16:10:52.604020']" does not match format '%Y-%m-%d %H:%M:%S.%f'
I suspect I'm passing the wrong data type to time.strptime(), but so far I haven't been able to find the correct one.
Any suggestions?
Recommended answer
I'd do something like this:
>>> import numpy as np
>>> from datetime import datetime
>>> wind = np.loadtxt("ws425.log.test", delimiter=",", usecols=(0,4), dtype=object,
... converters={0: lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f"),
... 4: float})  # the np.float alias was removed in NumPy 1.24; the builtin float works
>>>
>>> wind
array([[datetime.datetime(2013, 12, 11, 23, 0, 27, 3293), 5.8],
[datetime.datetime(2013, 12, 11, 23, 0, 28, 295), 5.5],
[datetime.datetime(2013, 12, 11, 23, 0, 29, 295), 4.0],
[datetime.datetime(2013, 12, 11, 23, 0, 30, 3310), 4.9]], dtype=object)
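As for why the attempt in the question fails: str(wind[:,0]) turns the whole column into one string (the printed form of the array), while strptime parses exactly one timestamp per call. Note also that time.strptime returns a struct_time, which has no microsecond field, whereas datetime.strptime keeps the fractional seconds. A minimal standard-library sketch of per-row parsing follows; the helper name read_wind and the inline SAMPLE data are illustrative assumptions, not part of the original post.

```python
import csv
import io
from datetime import datetime

# Inline copy of the sample rows from the question (assumption: the real
# file has the same seven-field layout on every line).
SAMPLE = """\
2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def read_wind(lines):
    """Hypothetical helper: parse columns 0 and 4 one row at a time."""
    timestamps, values = [], []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        # strptime is called once per timestamp string, never on a whole array
        timestamps.append(datetime.strptime(row[0], TIME_FORMAT))
        values.append(float(row[4]))
    return timestamps, values


timestamps, values = read_wind(io.StringIO(SAMPLE))
print(timestamps[0], values[0])
```

To read from disk instead, pass an open file object in place of the io.StringIO wrapper.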
For time series data, though, I've switched to using pandas, because it makes a lot of things much easier:
>>> import pandas as pd
>>> df = pd.read_csv("ws425.log.test", parse_dates=[0], header=None, usecols=[0, 4])
>>> df
0 4
0 2013-12-11 23:00:27.003293 5.8
1 2013-12-11 23:00:28.000295 5.5
2 2013-12-11 23:00:29.000295 4.0
3 2013-12-11 23:00:30.003310 4.9
[4 rows x 2 columns]
>>> df[0][0]
Timestamp('2013-12-11 23:00:27.003293', tz=None)
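As one illustration of what becomes easier, the parsed column can be promoted to a DatetimeIndex so that the float column can be summarized or sliced by time. This is only a sketch on the four sample rows from the question; the column names "timestamp" and "speed" are assumptions chosen for readability, since the original file has no header.

```python
import io

import pandas as pd

# Inline copy of the sample rows from the question.
SAMPLE = """\
2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B
"""

# Same read_csv call as above, but reading from an in-memory buffer.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, usecols=[0, 4], parse_dates=[0])

# Hypothetical column names; the file itself does not define any.
df.columns = ["timestamp", "speed"]
df = df.set_index("timestamp")

mean_speed = df["speed"].mean()
print(df)
print(mean_speed)
```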