Creating new columns in a sublevel of MultiIndex pandas columns


Problem description

I have MultiIndex columns. The higher level is some humans, and the sublevel is some measures. I would like to create some new columns that are derived from the measures (e.g. a rolling mean). I was hoping I could use some index slicing to achieve this, but alas, no. I've found some similar-ish questions here in the past, but they were old, and I suspect there are more modern, pythonic solutions.

Below is a toy example in which I demonstrate what I'm trying to do for one column (which works), and which shows that the same method fails if I try to apply it to all of the subcolumn groupings.

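import numpy as np
import pandas as pd

# pd.DatetimeIndex(start=...) was deprecated in pandas 0.24 and later removed; use pd.date_range
# (on pandas >= 2.2 the month-end alias is "ME" rather than "M")
index = pd.date_range(start='2018-1-1', periods=5, freq="M")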

persons = ['mike', 'dave', 'matt']
measures = ['spin', 'drag', 'bezel']
cols = pd.MultiIndex.from_product([persons, measures],names=['human', 'measure'])

xf = pd.DataFrame(index=index, data=np.random.rand(5,9), columns=cols)

idx = pd.IndexSlice

#Doing this to one specific column works
xf.loc[:,idx['mike','bezel']].rolling(window=2).mean()
xf.loc[:,idx['mike','roll']] = xf.loc[:,idx['mike','bezel']].rolling(window=2).mean()

#Trying to create a 'roll2' measure for all the humans (mike, dave,matt) doesn't work
xf.loc[:,idx[:,'roll2']] = "placeholder" #xf.loc[:,idx['mike','bezel']].rolling(window=2).mean()

xf

Recommended answer

First select the columns with xs, apply rolling, and create a MultiIndex; finally, join the result to the original:

df = xf.xs('bezel', axis=1, level=1).rolling(window=2).mean()
df.columns = [df.columns, ['roll2'] * len(df.columns)]
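As a minimal sketch of this first step, the snippet below runs the same selection on a small deterministic frame so the result is easy to check by hand. The data values and the reduced set of humans and measures are assumptions made for illustration only. It also builds the new columns with pd.MultiIndex.from_product, a variant of the list assignment above that keeps the level names ('human', 'measure').

```python
import numpy as np
import pandas as pd

# Small deterministic frame (hypothetical data) with the same column layout
index = pd.date_range("2018-01-31", periods=4, freq="D")
cols = pd.MultiIndex.from_product(
    [["mike", "dave"], ["spin", "bezel"]], names=["human", "measure"]
)
xf = pd.DataFrame(
    np.arange(16, dtype=float).reshape(4, 4), index=index, columns=cols
)

# Select the 'bezel' column of every human; xs drops the 'measure' level,
# leaving one column per human
df = xf.xs("bezel", axis=1, level="measure").rolling(window=2).mean()

# Rebuild two-level columns, reusing the original level names
df.columns = pd.MultiIndex.from_product(
    [df.columns, ["roll2"]], names=xf.columns.names
)
print(df)
```

Because the level names are preserved here, the 'measure' label still appears in the column header after the join.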

Another solution with rename:

df = (xf.xs('bezel', axis=1, level=1, drop_level=False).rolling(window=2).mean()
        .rename(columns={'bezel':'roll2'}))


print (df)
human           mike      dave      matt
               roll2     roll2     roll2
2018-01-31       NaN       NaN       NaN
2018-02-28  0.439297  0.756530  0.407606
2018-03-31  0.432513  0.436660  0.430393
2018-04-30  0.258736  0.469610  0.850996
2018-05-31  0.278869  0.698822  0.561285


xf = xf.join(df)
print (xf)
human           mike                          dave                      \
measure         spin      drag     bezel      spin      drag     bezel   
2018-01-31  0.811030  0.114535  0.326579  0.597781  0.194064  0.659795   
2018-02-28  0.774971  0.400888  0.552016  0.385539  0.582351  0.853266   
2018-03-31  0.794427  0.653428  0.313010  0.996514  0.524999  0.020055   
2018-04-30  0.307418  0.131451  0.204462  0.049346  0.198878  0.919165   
2018-05-31  0.196374  0.421594  0.353276  0.244024  0.930992  0.478479   

human           matt                          mike                dave  \
measure         spin      drag     bezel      roll     roll2     roll2   
2018-01-31  0.769308  0.657963  0.691395       NaN       NaN       NaN   
2018-02-28  0.564884  0.026864  0.123818  0.439297  0.439297  0.756530   
2018-03-31  0.755440  0.698443  0.736967  0.432513  0.432513  0.436660   
2018-04-30  0.782908  0.919064  0.965025  0.258736  0.258736  0.469610   
2018-05-31  0.414085  0.339771  0.157545  0.278869  0.278869  0.698822   

human           matt  
measure        roll2  
2018-01-31       NaN  
2018-02-28  0.407606  
2018-03-31  0.430393  
2018-04-30  0.850996  
2018-05-31  0.561285  

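Finally, if necessary, sort the MultiIndex columns (use this in place of the plain join above, since joining the same columns twice would overlap):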

xf = xf.join(df).sort_index(axis=1)
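The steps can be put together end to end as in the sketch below, which uses the rename variant and then joins and sorts. The small frame and its values are assumptions for illustration, and passing level="measure" to rename is an optional addition that restricts the renaming to that level.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two humans, two measures, four rows
index = pd.date_range("2018-01-31", periods=4, freq="D")
cols = pd.MultiIndex.from_product(
    [["mike", "dave"], ["spin", "bezel"]], names=["human", "measure"]
)
xf = pd.DataFrame(
    np.arange(16, dtype=float).reshape(4, 4), index=index, columns=cols
)

# Keep both column levels (drop_level=False), compute the rolling mean,
# then relabel 'bezel' as 'roll2' in the 'measure' level only
roll = (
    xf.xs("bezel", axis=1, level="measure", drop_level=False)
    .rolling(window=2)
    .mean()
    .rename(columns={"bezel": "roll2"}, level="measure")
)

# Join the new columns and sort so each human's measures sit together
out = xf.join(roll).sort_index(axis=1)
print(out.columns.tolist())
```

After sorting, each human's 'roll2' column is grouped with that human's other measures instead of being appended at the end.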
