Replacing values in a 2nd level column on MultiIndex df in Pandas

Problem description

I was looking into this post, which almost solved my problem. In my case, however, I want to work based on the 2nd level of the df's columns without specifying my 1st level column names explicitly.

Borrowing the original dataframe:

import pandas as pd

# Columns form a MultiIndex: level 0 is 'A'/'B', level 1 is 'a'/'b'
df = pd.DataFrame({('A','a'): [-1,-1,0,10,12],
                   ('A','b'): [0,1,2,3,-1],
                   ('B','a'): [-20,-10,0,10,20],
                   ('B','b'): [-200,-100,0,100,200]})

##df
     A         B
     a    b    a     b
0   -1    0  -20  -200
1   -1    1  -10  -100
2    0    2    0     0
3   10    3   10   100
4   12   -1   20   200

I want to assign NA to all columns a and b where b < 0. I was selecting them with df.xs('b',axis=1,level=1) < 0, but then I cannot actually perform the replacement. My 1st level names vary, so I cannot index on A and B explicitly; could it be done through df.columns.values?

The desired output would be:

##df
     A         B
     a    b    a    b
0   -1    0   NA   NA
1   -1    1   NA   NA
2    0    2    0    0
3   10    3   10  100
4   NA   NA   20  200

I appreciate all tips, thank you in advance.

Recommended answer

You can use DataFrame.mask. The boolean mask built from the 2nd level has only the 1st level names as columns, so first expand it with reindex so that it has the same index and column names as the original DataFrame:

mask = df.xs('b',axis=1,level=1) < 0
print (mask)
       A      B
0  False   True
1  False   True
2  False  False
3  False  False
4   True  False

print (mask.reindex(columns = df.columns, level=0))
       A             B       
       a      b      a      b
0  False  False   True   True
1  False  False   True   True
2  False  False  False  False
3  False  False  False  False
4   True   True  False  False

df = df.mask(mask.reindex(columns = df.columns, level=0))
print (df)
      A          B       
      a    b     a      b
0  -1.0  0.0   NaN    NaN
1  -1.0  1.0   NaN    NaN
2   0.0  2.0   0.0    0.0
3  10.0  3.0  10.0  100.0
4   NaN  NaN  20.0  200.0
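
The steps above can be put together as one self-contained sketch. The helper name `mask_groups_where_negative` and its `key` parameter are hypothetical, introduced here only for illustration; the pandas calls (`xs`, `reindex` with `level=0`, `mask`) are the ones used in the answer.

```python
import pandas as pd

df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, 1, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, 100, 200]})


def mask_groups_where_negative(frame, key='b'):
    """Hypothetical helper: set every column of a 1st-level group to NaN
    in the rows where that group's `key` column (2nd level) is negative."""
    # One boolean column per 1st-level name; no name is hard-coded.
    cond = frame.xs(key, axis=1, level=1) < 0
    # Broadcast each 1st-level column over all of its 2nd-level columns.
    full = cond.reindex(columns=frame.columns, level=0)
    # mask() replaces values with NaN wherever the condition is True.
    return frame.mask(full)


result = mask_groups_where_negative(df)
print(result)
```

Because NaN is a float, the masked columns are upcast from int to float, which is why the printed result shows values such as -1.0 rather than -1.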

Edit by OP: I had asked in the comments how to combine multiple conditions (e.g. df.xs('b',axis=1,level=1) < 0 OR df.xs('b',axis=1,level=1) being NA). @Jezrael kindly indicated that for this I should use:

# Parenthesize the comparison: | binds more tightly than < in Python
mask = (df.xs('b',axis=1,level=1) < 0) | df.xs('b',axis=1,level=1).isnull()
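
As a minimal sketch of the combined condition, the example below assumes a variant of the dataframe in which one `b` value is missing (the NaN in `('A','b')` is made up for illustration and is not in the original data). In Python `|` binds more tightly than `<`, so the comparison is wrapped in parentheses before the two conditions are combined.

```python
import numpy as np
import pandas as pd

# Same layout as the question's dataframe, with one hypothetical NaN in ('A', 'b').
df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, np.nan, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, 100, 200]})

b = df.xs('b', axis=1, level=1)

# True where b is negative OR missing; NaN < 0 is False, so isna() is needed.
cond = (b < 0) | b.isna()

# Expand the condition to the full MultiIndex columns and apply it.
result = df.mask(cond.reindex(columns=df.columns, level=0))
print(result)
```

Here row 1 of group A is masked because its `b` value is missing, and row 4 because its `b` value is negative.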
