Sorting Multi-Index to full depth (Pandas)


Question

I have a dataframe that I'm loading from a csv file and then setting the index to a few of its columns (usually two or three) with the set_index method. The idea is to then access parts of the dataframe using several key combinations, like this:

df.set_index(['fileName','phrase'])
df.ix['somePath','somePhrase']

Apparently, this type of selection with multiple keys is only possible if the MultiIndex of the dataframe is sorted to sufficient depth. In this case, since I'm supplying two keys, the .ix operation succeeds only if the dataframe's MultiIndex is sorted to a depth of at least 2.

For some reason, when I set the index as shown, df.index.lexsort_depth returns 1 even though both levels look sorted to me, and I get the following error when trying to access with two keys:

MultiIndex lexsort depth 1, key was length 2

Any help?

Recommended answer

It's not really clear what you are asking. See the MultiIndex section of the pandas documentation.

The OP needs to set the index, then sort in place:

df.set_index(['fileName','phrase'],inplace=True)
df.sortlevel(inplace=True)

Then access these levels via a tuple to get a specific result:

df.ix[('somePath','somePhrase')]
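
The calls above reflect the pandas version in use when this answer was written; sortlevel and .ix were later deprecated and removed. As a rough sketch of the same steps in a more recent pandas, assuming sort_index and .loc as the replacements, and using a made-up frame (the file names, phrases, and the count column are hypothetical, not the OP's data):

```python
import pandas as pd

# Hypothetical stand-in for the OP's CSV; rows are deliberately out of order.
df = pd.DataFrame({
    "fileName": ["b.txt", "a.txt", "b.txt", "a.txt"],
    "phrase": ["world", "hello", "hello", "world"],
    "count": [4, 1, 3, 2],
})

# set_index returns a new frame, so keep the result (or pass inplace=True).
df = df.set_index(["fileName", "phrase"])

# Setting the index does not reorder the rows, so the index is not sorted yet.
sorted_before = df.index.is_monotonic_increasing

# Sort on all index levels, i.e. to full depth.
df = df.sort_index()
sorted_after = df.index.is_monotonic_increasing

# Select one row with a tuple that supplies a key for every level.
row = df.loc[("a.txt", "world")]
print(sorted_before, sorted_after)
print(row)
```

Checking is_monotonic_increasing on the index is used here in place of lexsort_depth, since it is available as a public attribute across pandas versions.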

Maybe just give a toy example like this and show the specific result you want to get.

In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
   ...:           np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]

In [2]: df = DataFrame(randn(8, 4), index=arrays)

In [3]: df
Out[3]: 
                0         1         2         3
bar one  1.654436  0.184326 -2.337694  0.625120
    two  0.308995  1.219156 -0.906315  1.555925
baz one -0.180826 -1.951569  1.617950 -1.401658
    two  0.399151 -1.305852  1.530370 -0.132802
foo one  1.097562  0.097126  0.387418  0.106769
    two  0.465681  0.270120 -0.387639 -0.142705
qux one -0.656487 -0.154881  0.495044 -1.380583
    two  0.274045 -0.070566  1.274355  1.172247

In [4]: df.index.lexsort_depth
Out[4]: 2

In [5]: df.ix[('foo','one')]
Out[5]: 
0    1.097562
1    0.097126
2    0.387418
3    0.106769
Name: (foo, one), dtype: float64

In [6]: df.ix['foo']
Out[6]: 
            0         1         2         3
one  1.097562  0.097126  0.387418  0.106769
two  0.465681  0.270120 -0.387639 -0.142705

In [7]: df.ix[['foo']]
Out[7]: 
                0         1         2         3
foo one  1.097562  0.097126  0.387418  0.106769
    two  0.465681  0.270120 -0.387639 -0.142705

In [8]: df.sortlevel(level=1)
Out[8]: 
                0         1         2         3
bar one  1.654436  0.184326 -2.337694  0.625120
baz one -0.180826 -1.951569  1.617950 -1.401658
foo one  1.097562  0.097126  0.387418  0.106769
qux one -0.656487 -0.154881  0.495044 -1.380583
bar two  0.308995  1.219156 -0.906315  1.555925
baz two  0.399151 -1.305852  1.530370 -0.132802
foo two  0.465681  0.270120 -0.387639 -0.142705
qux two  0.274045 -0.070566  1.274355  1.172247

In [10]: df.sortlevel(level=1).index.lexsort_depth
Out[10]: 0
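
The last two steps show that sorting on level 1 alone leaves the index without a usable lexsort depth. A small sketch of what that means in practice, assuming a recent pandas where sort_index replaces sortlevel and where a tuple slice on an unsorted MultiIndex is expected to raise UnsortedIndexError (a KeyError subclass); the integer values stand in for the random data above:

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
# Deterministic values instead of randn so the result is reproducible.
df = pd.DataFrame(np.arange(32).reshape(8, 4),
                  index=pd.MultiIndex.from_arrays(arrays))

# Ordering by the second level first breaks the lexicographic order
# of the (level 0, level 1) pairs.
by_level1 = df.sort_index(level=1)
print(by_level1.index.is_monotonic_increasing)

# A slice between two full keys needs a lexsorted index.
try:
    by_level1.loc[("bar", "two"):("foo", "one")]
    slice_failed = False
except KeyError as exc:  # UnsortedIndexError derives from KeyError
    slice_failed = True
    print(type(exc).__name__, exc)

# Sorting on all levels restores full depth, and the same slice works.
full = by_level1.sort_index()
subset = full.loc[("bar", "two"):("foo", "one")]
print(subset)
```

Sorting without a level argument orders by every level in turn, which is what the selection with two keys in the question relies on.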
