`.loc` and `.iloc` with MultiIndex'd DataFrame
Question
When indexing a MultiIndex-ed DataFrame, it seems like .iloc assumes you're referencing the "inner level" of the index, while .loc looks at the outer level.
For example:
import numpy as np
import pandas as pd

np.random.seed(123)
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(8, 4), index=idx)
# .loc looks at the outer index:
print(df.loc['qux'])
# df.loc['two'] would throw KeyError
               0        1        2        3
second
one     -1.25388 -0.63775  0.90711 -1.42868
two     -0.14007 -0.86175 -0.25562 -2.79859
# while .iloc looks at the inner index:
print(df.iloc[-1])
0   -0.14007
1   -0.86175
2   -0.25562
3   -2.79859
Name: (qux, two), dtype: float64
Two questions:
Firstly, why is this? Is it a deliberate design decision?
Secondly, can I use .iloc to reference the outer level of the index, to yield the result below? I'm aware I could first find the last member of the index with get_level_values and then .loc-index with that, but I'm wondering if it can be done more directly, either with funky .iloc syntax or with some existing function designed specifically for the case.
# df.iloc[-1]
qux   one     -1.25388 -0.63775  0.90711 -1.42868
      two     -0.14007 -0.86175 -0.25562 -2.79859
Recommended answer
Yes, this is a deliberate design decision:
".iloc is a strict positional indexer, it does not regard the structure at all, only the first actual behavior. ... .loc does take into account the level behavior." [emphasis added]
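A minimal sketch of that distinction, rebuilding the DataFrame from the question. The variable names last_row and same_row are only illustrative. It shows that .iloc counts physical rows, while .loc resolves labels starting from the outer level.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(8, 4), index=idx)

# .iloc counts physical rows: -1 is the last row, whatever its labels are.
last_row = df.iloc[-1]
print(last_row.name)  # the full label tuple of that row

# .loc resolves labels level by level: a full tuple selects the same row.
same_row = df.loc[('qux', 'two')]
print(last_row.equals(same_row))

# A partial label (outer level only) returns every row stored under it.
print(df.loc['qux'].shape)
```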
So the desired result given in the question is not possible in a flexible manner with .iloc. The closest workaround, used in several similar questions, is
print(df.loc[[df.index.get_level_values(0)[-1]]])
                     0        1        2        3
first second
qux   one     -1.25388 -0.63775  0.90711 -1.42868
      two     -0.14007 -0.86175 -0.25562 -2.79859
Using double brackets will retain the first index level.
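The sketch below compares single and double brackets, then wraps the workaround in a small helper. The helper outer_iloc is a hypothetical name introduced here for illustration, not a pandas function. It assumes the outer level is level 0 and picks the n-th distinct outer label in order of appearance.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(8, 4), index=idx)

# Outer-level label of the last row.
last_outer = df.index.get_level_values(0)[-1]

dropped = df.loc[last_outer]    # single brackets: the 'first' level is dropped
kept = df.loc[[last_outer]]     # double brackets: both levels are retained
print(list(dropped.index.names))
print(list(kept.index.names))


def outer_iloc(frame, pos):
    """Hypothetical helper: select all rows under the pos-th distinct
    outer-level label (level 0), keeping the full MultiIndex."""
    # unique() keeps labels in order of first appearance.
    outer_labels = frame.index.get_level_values(0).unique()
    return frame.loc[[outer_labels[pos]]]


print(outer_iloc(df, -1))
```

The helper still goes through .loc, so it is a convenience wrapper around the same workaround rather than a true positional lookup on the outer level.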