Replacing values in a 2nd level column on MultiIndex df in Pandas
Question
I was looking into this post, which almost solved my problem. However, in my case I want to work based on the 2nd level of the df columns, without specifying my 1st level column names explicitly.
Borrowing the original dataframe:
import pandas as pd

df = pd.DataFrame({('A','a'): [-1,-1,0,10,12],
                   ('A','b'): [0,1,2,3,-1],
                   ('B','a'): [-20,-10,0,10,20],
                   ('B','b'): [-200,-100,0,100,200]})
##df
    A       B
    a  b    a    b
0  -1  0  -20 -200
1  -1  1  -10 -100
2   0  2    0    0
3  10  3   10  100
4  12 -1   20  200
I want to assign NA to all columns a and b where b < 0. I was selecting them based on df.xs('b',axis=1,level=1) < 0, but then I cannot actually perform the replacement. My 1st level names vary, so the indexing cannot be based on A and B explicitly, but possibly it could go through df.columns.values?
The desired output would be:
##df
    A       B
    a   b    a    b
0  -1   0   NA   NA
1  -1   1   NA   NA
2   0   2    0    0
3  10   3   10  100
4  NA  NA   20  200
I appreciate all tips, thank you in advance.
Recommended answer
You can use DataFrame.mask, with the boolean mask expanded by reindex so that it has the same index and column names as the original DataFrame:
mask = df.xs('b', axis=1, level=1) < 0
print(mask)
       A      B
0  False   True
1  False   True
2  False  False
3  False  False
4   True  False

print(mask.reindex(columns=df.columns, level=0))
       A             B
       a      b      a      b
0  False  False   True   True
1  False  False   True   True
2  False  False  False  False
3  False  False  False  False
4   True   True  False  False

df = df.mask(mask.reindex(columns=df.columns, level=0))
print(df)
      A          B
      a    b     a      b
0  -1.0  0.0   NaN    NaN
1  -1.0  1.0   NaN    NaN
2   0.0  2.0   0.0    0.0
3  10.0  3.0  10.0  100.0
4   NaN  NaN  20.0  200.0
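
For convenience, the same steps can be collected into a small helper. This is only a sketch under assumptions: the function name mask_groups_where_negative and its key parameter are made up for illustration and are not part of pandas or of the original answer. It assumes two column levels, with the condition column present under every 1st level name.

```python
import pandas as pd

df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, 1, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, 100, 200]})


def mask_groups_where_negative(frame, key='b'):
    """Hypothetical helper: set every column of a 1st level group to NaN
    in the rows where that group's `key` column is negative."""
    # One boolean column per 1st level name (here: A and B).
    cond = frame.xs(key, axis=1, level=1) < 0
    # Broadcast each 1st level column across all of its 2nd level columns.
    full = cond.reindex(columns=frame.columns, level=0)
    # Replace the flagged cells with NaN; a new DataFrame is returned.
    return frame.mask(full)


result = mask_groups_where_negative(df)
print(result)
```

Because the 1st level names are never written out, the helper should keep working when they change. The columns come back as floats because NaN cannot be stored in an integer column.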
Edit by OP: I had asked in the comments how to consider multiple conditions (e.g. df.xs('b',axis=1,level=1) < 0 OR df.xs('b',axis=1,level=1) being NA). @Jezrael kindly indicated that if I wanted to do this, I should consider:
mask = (df.xs('b',axis=1,level=1) < 0) | df.xs('b',axis=1,level=1).isnull()
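
A point worth checking when combining conditions: in Python, | binds more tightly than <, so each comparison needs its own parentheses before the two are OR-ed together. Below is a minimal end-to-end sketch. The NaN placed in ('A','b') is an assumption added purely for illustration and is not in the original data.

```python
import numpy as np
import pandas as pd

# Same frame as in the question, except for one NaN added to ('A', 'b')
# so that the second condition has something to match.
df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, np.nan, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, 100, 200]})

b = df.xs('b', axis=1, level=1)

# Parenthesize the comparison, then OR it with the missing-value test.
cond = (b < 0) | b.isna()

# Expand the per-group condition to every 2nd level column and apply it.
out = df.mask(cond.reindex(columns=df.columns, level=0))
print(out)
```

With this data, group A should be blanked in row 1 (missing b) and row 4 (negative b), and group B in rows 0 and 1.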