MultiIndex/Reshaping differences between Pandas versions


Question

I have a DataFrame built with the following code:

import numpy as np
import pandas as pd

# Five consecutive business days used as the row index.
index = pd.DatetimeIndex(['2017-05-04', '2017-05-05', '2017-05-08', '2017-05-09',
                          '2017-05-10'], dtype='datetime64[ns]', name='date', freq='B')

# Two-level columns: ticker on level 0, price field on level 1.
# Note: `labels` is the keyword in pandas 0.19/0.20; pandas 0.24 renamed it to `codes`.
columns = pd.MultiIndex(levels=[['HSBA LN Equity', 'UCG IM Equity', 'ISP IM Equity'],
                                ['LAST PRICE', 'HIGH', 'LOW']],
                        labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])

# One row per date; values follow the column order defined above.
data = np.array([[663.8, 672.5, 661.1, 15.97, 16.02, 15.49, 2.76, 2.768, 2.694],
                 [658.6, 663.9, 656.0, 16.22, 16.48, 15.77, 2.842, 2.868, 2.77],
                 [660.6, 664.1, 658.9, 16.01, 16.49, 15.94, 2.852, 2.898, 2.826],
                 [664.9, 669.2, 662.5, 15.90, 16.41, 15.90, 2.848, 2.898, 2.842],
                 [670.9, 673.4, 663.8, 16.09, 16.15, 15.59, 2.85, 2.888, 2.802]])
df = pd.DataFrame(data, columns=columns, index=index)

The output looks like this:

            HSBA LN Equity               UCG IM Equity                \
            LAST PRICE   HIGH    LOW    LAST PRICE   HIGH    LOW   
date                                                                  
2017-05-04          663.8  672.5  661.1         15.97  16.02  15.49   
2017-05-05          658.6  663.9  656.0         16.22  16.48  15.77   
2017-05-08          660.6  664.1  658.9         16.01  16.49  15.94   
2017-05-09          664.9  669.2  662.5         15.90  16.41  15.90   
2017-05-10          670.9  673.4  663.8         16.09  16.15  15.59   

       ISP IM Equity                
          LAST PRICE   HIGH    LOW  
date                                    
2017-05-04         2.760  2.768  2.694  
2017-05-05         2.842  2.868  2.770  
2017-05-08         2.852  2.898  2.826  
2017-05-09         2.848  2.898  2.842  
2017-05-10         2.850  2.888  2.802  

Since upgrading from Pandas 0.19.2 to 0.20.1, calling df = df.stack(0) produces this output:

                              HIGH  LAST PRICE      LOW
date                                                   
2017-05-04 HSBA LN Equity  672.500     663.800  661.100
           UCG IM Equity     2.768       2.760    2.694
           ISP IM Equity    16.020      15.970   15.490
2017-05-05 HSBA LN Equity  663.900     658.600  656.000
           UCG IM Equity     2.868       2.842    2.770
           ISP IM Equity    16.480      16.220   15.770
2017-05-08 HSBA LN Equity  664.100     660.600  658.900
           UCG IM Equity     2.898       2.852    2.826
           ISP IM Equity    16.490      16.010   15.940
2017-05-09 HSBA LN Equity  669.200     664.900  662.500
           UCG IM Equity     2.898       2.848    2.842
           ISP IM Equity    16.410      15.900   15.900
2017-05-10 HSBA LN Equity  673.400     670.900  663.800
           UCG IM Equity     2.888       2.850    2.802
           ISP IM Equity    16.150      16.090   15.590

The columns have changed order, and the equity prices no longer match the correct tickers: the UCG IM Equity rows now show the ISP IM Equity prices and vice versa.

Is this a bug?

Recommended answer

This was definitely a bug, and since two of my teams make heavy use of MultiIndex columns and stacking, it was a little scary. I tracked it down in your issue #16323 and made a PR (#16325).
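One way to check whether a given installation mislabels values when stacking is to compare every stacked cell against the wide frame it came from. The following is a minimal sketch, not part of the original fix: the helper name stack_preserves_values and the reduced two-day frame are illustrative assumptions, and the frame is built with the `codes` keyword, which assumes pandas 0.24 or later (older releases spell it `labels`). It also assumes every ticker has every field and that there are no missing values.

```python
import numpy as np
import pandas as pd

# Reduced version of the question's frame, keeping the same ticker order in the levels.
# Assumption: pandas >= 0.24, where this keyword is `codes` (it is `labels` in 0.19/0.20).
columns = pd.MultiIndex(
    levels=[["HSBA LN Equity", "UCG IM Equity", "ISP IM Equity"], ["LAST PRICE", "HIGH"]],
    codes=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]],
)
index = pd.DatetimeIndex(["2017-05-04", "2017-05-05"], name="date")
wide = pd.DataFrame(
    [[663.8, 672.5, 15.97, 16.02, 2.76, 2.768],
     [658.6, 663.9, 16.22, 16.48, 2.842, 2.868]],
    index=index,
    columns=columns,
)


def stack_preserves_values(wide, level=0):
    """Hypothetical helper: True if every stacked cell equals its source cell."""
    stacked = wide.stack(level)
    # No cells should be lost or invented (holds when there are no missing values).
    if stacked.size != wide.size:
        return False
    for (date, ticker), row in stacked.iterrows():
        for field, value in row.items():
            # Look the same (ticker, field) pair up in the original wide frame.
            if not np.isclose(wide.loc[date, (ticker, field)], value):
                return False
    return True


print(stack_preserves_values(wide))
```

On a version that exhibits the behaviour described in the question, the UCG IM Equity and ISP IM Equity rows would carry each other's prices, so the check would be expected to return False.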

The fix has already been merged into master and will be part of the 0.20.2 bugfix release, currently scheduled for June 13, 2017.
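Code that depends on stack() could guard against running on an affected release. The sketch below is only an illustration: the helper names are hypothetical, and the affected range (0.20.x releases before 0.20.2) is an assumption inferred from the versions mentioned in this thread, not something the pandas changelog is quoted as saying here.

```python
import re


def parse_version(version):
    """Return (major, minor, patch) from the leading numeric part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def stack_bug_suspected(version):
    """Hypothetical check: True for 0.20.x releases older than 0.20.2.

    Assumption: the range is inferred from the question (0.19.2 fine,
    0.20.1 affected) and the answer (fixed in 0.20.2).
    """
    return (0, 20, 0) <= parse_version(version) < (0, 20, 2)


print(stack_bug_suspected("0.20.1"))
print(stack_bug_suspected("0.20.2"))
```

In practice the installed version would be passed in, for example stack_bug_suspected(pd.__version__), and the caller could raise or log a warning before any stacking is done.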
