Querying an HDF store

Question

I created an HDF5 file with:

hdf=pandas.HDFStore(pfad)
hdf.append('df', df, data_columns=True)

I have a list of numpy.datetime64 values called expirations, and I am trying to read into a DataFrame the part of the HDF5 table whose values in the "expiration" column lie between expirations[0] and expirations[1]. Entries in the expiration column look like Timestamp('2002-05-18 00:00:00').

I use the following command:

df=hdf.select('df', where=('expiration<expiration[1] & expiration>=expirations[0]'))

However, I get ValueError: Unable to parse x. How should this be done correctly?

df.dtypes
Out[37]: 
adjusted stock close price           float64
expiration                    datetime64[ns]
strike                                 int64
call put                              object
ask                                  float64
bid                                  float64
volume                                 int64
open interest                          int64
unadjusted stock price               float64

df.info
Out[36]: 
<bound method DataFrame.info of             adjusted stock close price expiration  strike call put      ask
date
2002-05-16                     5047.00 2002-05-18    4300        C  802.000

There are more columns, but they are not relevant to the query.

Recommended answer

Problem solved!

I had obtained the expirations with:

 df_expirations=df.drop_duplicates(subset='expiration')
 expirations=df['expiration'].values

This apparently changed the value format from datetime to tz datetime. I worked around this by using:

 expirations=df['expiration']
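
As a small illustration of the difference between the two approaches, the sketch below compares the element type returned by .values with the element type returned by indexing the column itself. The sample dates are made up, and whether a numpy.datetime64 value is accepted in a where clause may depend on the pandas version, so treat this as an assumption-laden example rather than a diagnosis.

```python
import numpy as np
import pandas as pd

# Made-up sample data; only the column name follows the question.
df = pd.DataFrame(
    {"expiration": pd.to_datetime(["2002-05-18", "2002-06-22", "2002-06-22"])}
)

as_array = df["expiration"].values   # plain NumPy array
as_series = df["expiration"]         # pandas Series

print(type(as_array[0]))    # a numpy.datetime64 scalar
print(type(as_series[0]))   # a pandas Timestamp

# A numpy.datetime64 can be converted explicitly when a Timestamp is needed.
converted = pd.Timestamp(as_array[0])
print(converted == as_series[0])
```

Converting with pd.Timestamp is an alternative to re-reading the column when only the NumPy values are at hand.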

Now this query works:

 del df
 df=hdf.select('df', where=('expiration=expirations[1]'))

Thanks for pointing out the datetime format problem.
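
For the range query asked about in the question, a minimal sketch could look like the following. It assumes PyTables is installed, uses made-up sample data and a hypothetical temporary file demo.h5, and introduces the names first_exp and second_exp purely for illustration. The bounds are pandas Timestamps referenced by name in the where string, so pandas resolves them from the surrounding Python scope.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data; only the column names follow the question.
df = pd.DataFrame(
    {
        "expiration": pd.to_datetime(
            ["2002-05-18", "2002-05-18", "2002-06-22", "2002-07-20"]
        ),
        "strike": [4300, 4400, 4300, 4300],
    }
)

# Unique expiration dates as pandas Timestamps, in ascending order.
expirations = sorted(df["expiration"].drop_duplicates())
first_exp, second_exp = expirations[0], expirations[1]

path = os.path.join(tempfile.mkdtemp(), "demo.h5")  # hypothetical file
with pd.HDFStore(path) as hdf:
    # data_columns=True makes every column queryable in a where clause.
    hdf.append("df", df, data_columns=True)
    # first_exp and second_exp are looked up in the calling scope.
    subset = hdf.select(
        "df", where="expiration >= first_exp & expiration < second_exp"
    )

print(subset)
```

Keeping the bounds in named variables avoids formatting dates into the query string by hand.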
