How to retrieve pandas df multiindex from HDFStore?
Question
For a DataFrame with a simple index, the index can be retrieved from an HDFStore as follows:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(2, 3), index=list('yz'), columns=list('abc'))
df
>>> a b c
>>> y -0.181063 1.919440 1.550992
>>> z -0.701797 1.917156 0.645707
with pd.HDFStore('test.h5') as store:
    store.put('df', df, format='t')
    store.select_column('df', 'index')
>>> 0 y
>>> 1 z
>>> Name: index, dtype: object
This is described in the documentation.
With a MultiIndex, however, this trick does not work:
df = pd.DataFrame(np.random.randn(2, 3),
                  index=pd.MultiIndex.from_tuples([(0, 'y'), (1, 'z')], names=['lvl0', 'lvl1']),
                  columns=list('abc'))
df
>>> a b c
>>> lvl0 lvl1
>>> 0 y -0.871125 0.001773 0.618647
>>> 1 z 1.001547 1.132322 -0.215681
More precisely, it returns the wrong index:
with pd.HDFStore('test.h5') as store:
    store.put('df', df, format='t')
    store.select_column('df', 'index')
>>> 0 0
>>> 1 1
>>> Name: index, dtype: int64
How can the correct DataFrame MultiIndex be retrieved?
Recommended answer
One can use select with the columns=['index'] parameter specified:
df = pd.DataFrame(np.random.randn(2, 3),
                  index=pd.MultiIndex.from_tuples([(0, 'y'), (1, 'z')], names=['lvl0', 'lvl1']),
                  columns=list('abc'))
df
>>> a b c
>>> lvl0 lvl1
>>> 0 y -0.871125 0.001773 0.618647
>>> 1 z 1.001547 1.132322 -0.215681
with pd.HDFStore('test.h5') as store:
    store.put('df', df, format='t')
    store.select('df', columns=['index'])
>>> lvl0 lvl1
>>> 0 y
>>> 1 z
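The select call above returns a DataFrame with no data columns, so the MultiIndex itself should be available as that frame's .index attribute. Below is a minimal sketch of this, assuming PyTables is installed and that select behaves as in the output shown above. The temporary file path and the variable name idx are illustrative choices, not part of the original answer.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(2, 3),
    index=pd.MultiIndex.from_tuples([(0, 'y'), (1, 'z')], names=['lvl0', 'lvl1']),
    columns=list('abc'),
)

# Hypothetical scratch location, used so the sketch does not overwrite an existing test.h5.
path = os.path.join(tempfile.mkdtemp(), 'test.h5')

with pd.HDFStore(path) as store:
    store.put('df', df, format='t')
    # select returns a DataFrame without data columns; its .index is the MultiIndex.
    idx = store.select('df', columns=['index']).index

print(idx.names)
print(idx.tolist())
```

Taking .index inside the with block keeps the store open only for the read, and idx can then be used like any other MultiIndex, for example with df.reindex(idx).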
It works, but it does not seem to be ...