Reading a comma-delimited file with a date object and a float with Python


Problem description

I have a file with entries that look like this:

2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B

Since the delimiter is a comma (','), this is a classic CSV file.

I tried:

wind = loadtxt("/disk2/Wind/ws425.log.test", dtype(str,float), delimiter=',', usecols=(0,4))
ts= time.strptime(str(wind[:,0]), '%Y-%m-%d %H:%M:%S.%f')

and this is what I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-31-484b71dea724> in <module>()
----> 1 ts= time.strptime(str(wind[:,0]), '%Y-%m-%d %H:%M:%S.%f')

/opt/Enthought/canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/_strptime.pyc in _strptime_time(data_string, format)
    452 
    453 def _strptime_time(data_string, format="%a %b %d %H:%M:%S %Y"):
--> 454     return _strptime(data_string, format)[0]

/opt/Enthought/canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/_strptime.pyc in _strptime(data_string, format)
    323     if not found:
    324         raise ValueError("time data %r does not match format %r" %
--> 325                          (data_string, format))
    326     if len(data_string) != found.end(): 
    327         raise ValueError("unconverted data remains: %s" %

ValueError: time data "['2013-12-12 00:00:02.251311' '2013-12-12 00:00:03.255296'\n     '2013-12-12 00:00:04.254294' ..., '2013-12-12 16:10:50.579022'\n '2013-12-12    16:10:51.607035' '2013-12-12 16:10:52.604020']" does not match format '%Y-%m-%d %H:%M:%S.%f'

I suspect I'm misusing the data type assignment in time.strptime(), but so far I haven't been able to find the correct type.

Any suggestions?

Recommended answer

I would do something like this:

>>> import numpy as np
>>> from datetime import datetime
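>>> wind = np.loadtxt("ws425.log.test", delimiter=",", usecols=(0,4), dtype=object,
...                   encoding="utf-8",  # so the converters receive str rather than bytes
...                   converters={0: lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f"),
...                               4: float})  # the np.float alias was removed in NumPy 1.24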
>>> 
>>> wind
array([[datetime.datetime(2013, 12, 11, 23, 0, 27, 3293), 5.8],
       [datetime.datetime(2013, 12, 11, 23, 0, 28, 295), 5.5],
       [datetime.datetime(2013, 12, 11, 23, 0, 29, 295), 4.0],
       [datetime.datetime(2013, 12, 11, 23, 0, 30, 3310), 4.9]], dtype=object)
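As for why the original attempt failed: str(wind[:,0]) turns the whole column into one string (the printed form of the array), while strptime expects a single timestamp per call. If NumPy is not otherwise needed, the same per-row parsing can be sketched with only the standard library. This is a minimal sketch, not the only way to do it; the names SAMPLE, TIME_FORMAT and read_wind are made up for illustration, and the sample rows are copied from the question in place of the real log file.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the question; a real script would open the log file instead.
SAMPLE = """\
2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def read_wind(lines):
    """Return a list of (datetime, float) pairs taken from columns 0 and 4."""
    rows = []
    for fields in csv.reader(lines):
        if not fields:  # skip blank lines
            continue
        # strptime parses ONE timestamp string, so it is called once per row.
        stamp = datetime.strptime(fields[0], TIME_FORMAT)
        rows.append((stamp, float(fields[4])))
    return rows


wind = read_wind(io.StringIO(SAMPLE))
print(wind[0])
```

To read the actual file, an open file object can be passed in place of the io.StringIO wrapper, since csv.reader accepts any iterable of lines.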






For time series data, though, I've switched to using pandas, because it makes a lot of things much easier:

>>> import pandas as pd
>>> df = pd.read_csv("ws425.log.test", parse_dates=[0], header=None, usecols=[0, 4])
>>> df
                           0    4
0 2013-12-11 23:00:27.003293  5.8
1 2013-12-11 23:00:28.000295  5.5
2 2013-12-11 23:00:29.000295  4.0
3 2013-12-11 23:00:30.003310  4.9

[4 rows x 2 columns]
>>> df[0][0]
Timestamp('2013-12-11 23:00:27.003293', tz=None)
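
As one example of what becomes easier, the timestamps can be made the index so that the wind values are addressed by time. The following is only a sketch under assumptions: the column names "time" and "speed" are invented here (the log has no header row), and the sample rows from the question stand in for the real file.

```python
import io

import pandas as pd

# Sample rows copied from the question; pass the log file path to read_csv instead.
SAMPLE = """\
2013-12-11 23:00:27.003293,$PAMWV,291,R,005.8,M,A*36
2013-12-11 23:00:28.000295,$PAMWV,284,R,005.5,M,A*3F
2013-12-11 23:00:29.000295,$PAMWV,273,R,004.0,M,A*33
2013-12-11 23:00:30.003310,$PAMWV,007,R,004.9,M,A*3B
"""

# Same call as above: keep columns 0 and 4 and parse column 0 as dates.
df = pd.read_csv(io.StringIO(SAMPLE), parse_dates=[0], header=None, usecols=[0, 4])

# "time" and "speed" are hypothetical labels chosen for this sketch.
df.columns = ["time", "speed"]
df = df.set_index("time")

mean_speed = df["speed"].mean()        # average of the float column
span = df.index[-1] - df.index[0]      # elapsed time as a Timedelta
print(mean_speed, span)
```

With a datetime index in place, time-based selection and aggregation work directly on the frame.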
