Filtering rows with MultiIndex columns
Question
When creating a DataFrame with MultiIndex columns, it does not seem possible to select or filter rows using syntax like df[df["AA"]>0.0]. For example:
import pandas as pd
import numpy as np

dates = np.asarray(pd.date_range('1/1/2000', periods=8))
_metaInfo = pd.MultiIndex.from_tuples([('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')], names=['parameter', 'unit'])
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=_metaInfo)
# Problem line: df['AA'] > 0.0 is a one-column DataFrame, not a Series
print(df[df['AA'] > 0.0])
The result of df["AA"]>0.0 is an indexed DataFrame instead of a time series (Series). This probably causes the crash.
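One way to check this is to inspect the types directly. The sketch below rebuilds a frame shaped like the one above and compares a partial column key with a full tuple key. The seeded generator and the names `partial` and `full` are additions for illustration only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
cols = pd.MultiIndex.from_tuples(
    [('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')],
    names=['parameter', 'unit'])
rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=cols)

partial = df['AA'] > 0.0        # first level only: still a DataFrame
full = df[('AA', '[m]')] > 0.0  # full tuple key: a boolean Series

print(type(partial).__name__, partial.shape)
print(type(full).__name__, full.shape)
```

Selecting by the first level alone keeps the remaining 'unit' level as a column index, which is why the comparison yields a DataFrame rather than a row mask.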
When using the same _metaInfo as an index for the rows, the situation is different:
df1 = pd.DataFrame(np.random.randn(4, 6), index=_metaInfo)
print(df1[df1["AA"] > 0.0])
This produces:
[ 1.13268106 -0.06887761 0.68535054 2.49431163 -0.29349413 0.34772553]
which are the elements of row AA larger than zero. This gives only the values of row AA and not those of the other columns of the DataFrame.
Is there a workaround? Am I trying to do something I shouldn't?
Recommended answer
You can select only the 'AA' column by its full tuple key, ('AA', '[m]'), and use the resulting boolean Series as a filter on the entire df.
Like this:
df[df[('AA','[m]')]>0.0]
parameter         AA        BB        CC        DD
unit             [m]       [m]       [s]       [s]
2000-01-01  0.600748 -1.163793 -0.982248 -0.397988
2000-01-03  1.045428  0.365353  0.049152  1.902942
2000-01-06  0.891202  0.021921  1.215515 -1.624741
2000-01-08  0.999217 -1.110213  0.257718 -0.096018
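The same idea can be tried on a small frame with fixed values, so the result is predictable. This is a minimal sketch: the data and the names `mask`, `mask_alt`, `filtered`, and `filtered_alt` are made up for illustration and are not from the original answer.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples(
    [('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')],
    names=['parameter', 'unit'])
# Illustrative values, not the random data from the question
df = pd.DataFrame(
    [[0.5, -1.0, 2.0, 3.0],
     [-0.2, 0.4, 1.0, -1.0],
     [1.5, 0.0, -3.0, 0.5]],
    index=pd.date_range('1/1/2000', periods=3),
    columns=cols)

# Full tuple key gives a boolean Series aligned with the row index
mask = df[('AA', '[m]')] > 0.0
filtered = df[mask]

# If only the first level is known, reduce the one-column DataFrame to a Series
mask_alt = (df['AA'] > 0.0).iloc[:, 0]
filtered_alt = df[mask_alt]

print(filtered)
```

Both masks keep the rows where the AA value is positive and return all four columns. The `iloc[:, 0]` variant assumes that exactly one column sits under 'AA'.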