Querying an HDF store
Problem description
I created an HDF5 file with:
hdf=pandas.HDFStore(pfad)
hdf.append('df', df, data_columns=True)
I have a list called expirations that contains numpy.datetime64 values. I am trying to read into a DataFrame the portion of the HDF5 table whose "expiration" column has values between expirations[0] and expirations[1]. Entries in the expiration column have the format Timestamp('2002-05-18 00:00:00').
I use the following command:
df=hdf.select('df', where=('expiration<expiration[1] & expiration>=expirations[0]'))
However, I get "ValueError: Unable to parse x". How should this be done correctly?
df.dtypes
Out[37]:
adjusted stock close price float64
expiration datetime64[ns]
strike int64
call put object
ask float64
bid float64
volume int64
open interest int64
unadjusted stock price float64
df.info
Out[36]:
<bound method DataFrame.info of adjusted stock close price expiration strike call put ask date
2002-05-16 5047.00 2002-05-18 4300 C 802.000
There are more columns, but they are not relevant to the query.
Recommended answer
Problem solved!
I had been building expirations with:
df_expirations=df.drop_duplicates(subset='expiration')
expirations=df['expiration'].values
This evidently changed the values from pandas datetimes (Timestamp) into a different datetime representation, since .values returns the underlying NumPy datetime64 array. I worked around this by using:
expirations=df['expiration']
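The type difference described above can be checked with a small sketch. The Series below is a hypothetical stand-in for the "expiration" column, not the data from the question; it only illustrates that an element taken from .values is a numpy.datetime64, while an element taken from the Series is a pandas Timestamp.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the "expiration" column from the question.
s = pd.Series(pd.to_datetime(["2002-05-18", "2002-06-22"]), name="expiration")

raw = s.values[0]    # element of the underlying NumPy array
boxed = s.iloc[0]    # element taken from the Series itself

print(type(raw))     # a numpy.datetime64 scalar
print(type(boxed))   # a pandas Timestamp

# Wrapping the raw value in pd.Timestamp recovers the pandas type.
converted = pd.Timestamp(raw)
```

If a list of numpy.datetime64 values already exists, wrapping each value in pd.Timestamp before building the query is one way to get back to the type that the column entries display.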
Now this query works:
del df
df=hdf.select('df', where=('expiration=expirations[1]'))
Thanks for pointing out the datetime format problem.
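For the range query asked about in the question, a minimal sketch might look like the following. It assumes PyTables is installed, and it uses a small made-up table, a temporary file, and the hypothetical variable names start and end; none of these come from the original post. The bounds are converted to pandas Timestamps and referenced by name in the where string.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the option table in the question.
df = pd.DataFrame({
    "expiration": pd.to_datetime(
        ["2002-05-18", "2002-06-22", "2002-07-20", "2002-06-22"]
    ),
    "strike": [4300, 4400, 4500, 4600],
    "ask": [802.0, 710.5, 655.0, 540.0],
})

# Unique expiration dates as pandas Timestamps rather than raw numpy.datetime64.
expirations = sorted(pd.Timestamp(x) for x in df["expiration"].unique())
start, end = expirations[0], expirations[1]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "options.h5")
    with pd.HDFStore(path) as hdf:
        # data_columns=True makes every column queryable on disk.
        hdf.append("df", df, data_columns=True)
        # Variables named in the where string are looked up in the calling scope.
        subset = hdf.select("df", where="expiration >= start & expiration < end")

print(subset)
```

Using plain scalar variables in the where string keeps the expression simple; the subscripted form from the answer above (expirations[1]) is reported there to work as well.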