如何将层次结构或多索引应用于 pandas 列 [英] How to apply hierarchy or multi-index to pandas columns

查看:85
本文介绍了如何将层次结构或多索引应用于 pandas 列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经看到了许多有关如何分层排列数据帧行索引的示例,但是我试图对列进行相同操作,并且不了解语法:

I have seen lots of examples on how to arrange dataframe row indexes hierarchically, but I am trying to do the same for columns and am not understanding the syntax:

给出:

df = pd.DataFrame(np.random.randn(10,10),
                  columns=['consumption', 'voltage', 'consumption', 
                           'voltage', 'temperature', 'humidity', 'consumption', 
                           'voltage','temperature','humidity'], 
                  index= pd.date_range('20000103',periods=10))

>>> df
            consumption   voltage  consumption   voltage  temperature  \
2000-01-03    -1.327735 -1.440285     0.317122 -1.120105     1.736651   
2000-01-04     0.132531  0.646972     2.296734  0.332154    -0.541792   
2000-01-05     0.127623  0.592778     0.162096  0.107398    -0.628785   
2000-01-06    -1.441151  0.215424     0.021068  0.683085    -0.783994   
2000-01-07    -0.157848  1.566780     0.599017 -0.628216     0.500251   
2000-01-08    -0.498926  0.338771     0.400159  1.571975     0.255635   
2000-01-09     0.516618 -1.936360     0.199388 -0.110415     2.690859   
2000-01-10    -0.779012 -1.310022    -1.207503  0.095679    -0.134244   
2000-01-11     0.644262  0.068196     1.041745 -0.444408    -0.751595   
2000-01-12    -0.608046  0.506588    -1.003893  0.473716     0.211991   

            humidity  consumption   voltage  temperature  humidity  
2000-01-03  0.039869     1.875807  0.129065     0.132419  0.572678  
2000-01-04  1.997363     0.543881 -1.235036     1.155389  1.282912  
2000-01-05 -0.458992     0.371589  0.698094     0.695067 -1.095875  
2000-01-06  2.512991     0.795234  1.220327    -0.688820  0.875705  
2000-01-07  0.263855    -1.253786 -0.308674     1.000057  1.474928  
2000-01-08 -0.614560    -0.398284  1.307488    -0.002438  1.572630  
2000-01-09  0.363889     2.571522  1.048124     2.574866 -0.417247  
2000-01-10 -0.125377     1.004011  1.312716    -2.036689  0.557569  
2000-01-11 -0.818585    -0.595743  1.106869    -2.226666 -0.679508  
2000-01-12  0.705707    -0.959365  0.689911     0.498411 -0.353557 

我想做的是向列中添加层次结构索引或什至类似于标记的东西,以使它们看起来像这样:

What I would like to do is add a hierarchical index or even something akin to a tag to the columns, so that they looked something like this:

BUILDING        1
DEVICETYPE      METER                  METER                  WEATHER
DEVICEID        A                      B                      A
FIELD           consumption   voltage  consumption   voltage  temperature  \
    2000-01-03    -1.327735 -1.440285     0.317122 -1.120105     1.736651   
    2000-01-04     0.132531  0.646972     2.296734  0.332154    -0.541792   
    2000-01-05     0.127623  0.592778     0.162096  0.107398    -0.628785   
    2000-01-06    -1.441151  0.215424     0.021068  0.683085    -0.783994   
    2000-01-07    -0.157848  1.566780     0.599017 -0.628216     0.500251   
    2000-01-08    -0.498926  0.338771     0.400159  1.571975     0.255635   
    2000-01-09     0.516618 -1.936360     0.199388 -0.110415     2.690859   
    2000-01-10    -0.779012 -1.310022    -1.207503  0.095679    -0.134244   
    2000-01-11     0.644262  0.068196     1.041745 -0.444408    -0.751595   
    2000-01-12    -0.608046  0.506588    -1.003893  0.473716     0.211991   

BUILDING                  2
DEVICETYPE                METER                  WEATHER
DEVICEID                  A                      A
                humidity  consumption   voltage  temperature  humidity  
    2000-01-03  0.039869     1.875807  0.129065     0.132419  0.572678  
    2000-01-04  1.997363     0.543881 -1.235036     1.155389  1.282912  
    2000-01-05 -0.458992     0.371589  0.698094     0.695067 -1.095875  
    2000-01-06  2.512991     0.795234  1.220327    -0.688820  0.875705  
    2000-01-07  0.263855    -1.253786 -0.308674     1.000057  1.474928  
    2000-01-08 -0.614560    -0.398284  1.307488    -0.002438  1.572630  
    2000-01-09  0.363889     2.571522  1.048124     2.574866 -0.417247  
    2000-01-10 -0.125377     1.004011  1.312716    -2.036689  0.557569  
    2000-01-11 -0.818585    -0.595743  1.106869    -2.226666 -0.679508  
    2000-01-12  0.705707    -0.959365  0.689911     0.498411 -0.353557 

推荐答案

您可以使用from_arraysfrom_tuplesfrom_product类方法定义MultiIndices .这是使用from_arrays的示例:

You can define MultiIndices using the from_arrays or from_tuples or from_product classmethods. Here is an example using from_arrays:

arrays = [[1, 2]*3, ['A', 'B', 'C']*2]
columns = pd.MultiIndex.from_arrays(arrays, names=['foo', 'bar'])

df = pd.DataFrame(np.random.randn(2,6),
                  columns=columns,
                  index= pd.date_range('20000103',periods=2))

收益

In [81]: df
Out[81]: 
foo                1         2         1         2         1         2
bar                A         B         C         A         B         C
2000-01-03  1.277234 -0.899547  0.040337 -0.878752 -0.524336  0.922440
2000-01-04 -1.706797  0.450379  1.510868 -2.539827 -1.909996 -0.003851

为索引定义MultiIndex的方式与为列完全相同.

Defining a MultiIndex for the index is done in exactly the same way as for columns.

这篇关于如何将层次结构或多索引应用于 pandas 列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆