Sorting Multi-Index to full depth (Pandas)
Problem description
I have a dataframe that I load from a CSV file; I then set its index to a few of its columns (usually two or three) with the set_index method. The idea is to then access parts of the dataframe using combinations of several keys, like this:
df.set_index(['fileName','phrase'])
df.ix['somePath','somePhrase']
Apparently, this kind of selection with multiple keys is only possible if the MultiIndex of the dataframe is sorted to sufficient depth. In this case, since I'm supplying two keys, the .ix operation succeeds only if the dataframe's MultiIndex is sorted to a depth of at least 2.
For some reason, when I set the index as shown, df.index.lexsort_depth returns 1 even though both levels look sorted to me, and I get the following error when trying to access with two keys:
MultiIndex lexsort depth 1, key was length 2
Any help?
Recommended answer
It's not really clear what you are asking. Multi-index docs are here
The OP needs to set the index, then sort in place:
df.set_index(['fileName','phrase'],inplace=True)
df.sortlevel(inplace=True)
Then access these levels via a tuple to get a specific result:
df.ix[('somePath','somePhrase')]
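For readers on a current pandas release: .ix and sortlevel were removed in pandas 1.0, and the same steps are usually written with sort_index and .loc. A minimal sketch, assuming a small made-up table in place of the CSV (the column names fileName and phrase come from the question; the count column and all values are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents; rows are deliberately unsorted.
df = pd.DataFrame({
    "fileName": ["b.txt", "a.txt", "b.txt", "a.txt"],
    "phrase": ["y", "x", "x", "y"],
    "count": [4, 1, 3, 2],
})

# Set the MultiIndex, then sort it across all levels.
df = df.set_index(["fileName", "phrase"]).sort_index()

# A fully sorted MultiIndex is monotonically increasing.
is_sorted = df.index.is_monotonic_increasing

# Select one row with a tuple of keys, or every row under a first-level key.
row = df.loc[("a.txt", "x")]
rows_for_file = df.loc["a.txt"]
```

Note that set_index returns a new dataframe unless inplace=True is passed, so the result has to be assigned back as above.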
Maybe just give a toy example like this one and show the specific result you want to get.
In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
   ...:           np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
In [2]: df = DataFrame(randn(8, 4), index=arrays)
In [3]: df
Out[3]:
                0         1         2         3
bar one  1.654436  0.184326 -2.337694  0.625120
    two  0.308995  1.219156 -0.906315  1.555925
baz one -0.180826 -1.951569  1.617950 -1.401658
    two  0.399151 -1.305852  1.530370 -0.132802
foo one  1.097562  0.097126  0.387418  0.106769
    two  0.465681  0.270120 -0.387639 -0.142705
qux one -0.656487 -0.154881  0.495044 -1.380583
    two  0.274045 -0.070566  1.274355  1.172247
In [4]: df.index.lexsort_depth
Out[4]: 2
In [5]: df.ix[('foo','one')]
Out[5]:
0 1.097562
1 0.097126
2 0.387418
3 0.106769
Name: (foo, one), dtype: float64
In [6]: df.ix['foo']
Out[6]:
            0         1         2         3
one  1.097562  0.097126  0.387418  0.106769
two  0.465681  0.270120 -0.387639 -0.142705
In [7]: df.ix[['foo']]
Out[7]:
                0         1         2         3
foo one  1.097562  0.097126  0.387418  0.106769
    two  0.465681  0.270120 -0.387639 -0.142705
In [8]: df.sortlevel(level=1)
Out[8]:
                0         1         2         3
bar one  1.654436  0.184326 -2.337694  0.625120
baz one -0.180826 -1.951569  1.617950 -1.401658
foo one  1.097562  0.097126  0.387418  0.106769
qux one -0.656487 -0.154881  0.495044 -1.380583
bar two  0.308995  1.219156 -0.906315  1.555925
baz two  0.399151 -1.305852  1.530370 -0.132802
foo two  0.465681  0.270120 -0.387639 -0.142705
qux two  0.274045 -0.070566  1.274355  1.172247
In [10]: df.sortlevel(level=1).index.lexsort_depth
Out[10]: 0
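The session above shows that sorting on the second level alone leaves the index unsorted with respect to the first level. A rough sketch of the same observation on a recent pandas, under a few assumptions: sort_index(level=1) is used as the counterpart of sortlevel(level=1), is_monotonic_increasing is used as the "fully sorted" check in place of lexsort_depth, and the integer data is made up so the result is reproducible:

```python
import numpy as np
import pandas as pd

# Same two-level index as the toy example; the values are arbitrary integers.
index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
])
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# As constructed, the index is already sorted across both levels.
sorted_as_built = df.index.is_monotonic_increasing

# Sorting on level 1 groups all 'one' rows before all 'two' rows,
# so the index is no longer ordered by (level 0, level 1).
by_level1 = df.sort_index(level=1)
sorted_by_level1 = by_level1.index.is_monotonic_increasing

# Swapping the levels makes the sorted level the outer one,
# and the index is ordered again.
swapped = by_level1.swaplevel(0, 1)
sorted_after_swap = swapped.index.is_monotonic_increasing

# Tuple lookup on the fully sorted frame.
foo_one = df.loc[("foo", "one")]
```

Calling sort_index() with no level argument sorts across every level, which is what multi-key selection relies on.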