How to multi index rows using pandas


Question

I am new to pandas. Whenever I run my code, the first index level is repeated on every row.

What I tried:

rg, gg, and bb are arrays

import pandas as pd

cases = ['a1', 'a2', 'a3']
data = {'RED': rg, 'GREEN': gg, 'BLUE': bb}
stat_index = ['HELO', 'HERE']
df = pd.DataFrame(data,
                  index=pd.MultiIndex.from_product([cases, stat_index]),
                  columns=['RED', 'GREEN', 'BLUE'])
df.to_csv("OUT.CSV")

What I got:

             RED         GREEN       BLUE
a1  HELO    304.907     286.074     12.498
a1  HERE    508.670     509.784     94.550
a2  HELO    448.974     509.406     56.466
a2  HERE    764.727     432.084     43.462
a3  HELO    412.539     602.001     10.849
a3  HERE    321.9       603.888     78.847

What I actually want:

             RED         GREEN       BLUE
a1  HELO    304.907     286.074     12.498
    HERE    508.670     509.784     94.550
a2  HELO    448.974     509.406     56.466
    HERE    764.727     432.084     43.462
a3  HELO    412.539     602.001     10.849
    HERE    321.9       603.888     78.847

Recommended answer

Don't do this unless you really need it.

This is expected: when a MultiIndex is written to a file, the first-level labels are repeated on every row. When a DataFrame with a MultiIndex is displayed, the repeated labels are only hidden by default. If you set display.multi_sparse to False, you can see the actual data:

with pd.option_context('display.multi_sparse', False):
    print(df)
             RED    GREEN    BLUE
a1 HELO  304.907  286.074  12.498
a1 HERE  508.670  509.784  94.550
a2 HELO  448.974  509.406  56.466
a2 HERE  764.727  432.084  43.462
a3 HELO  412.539  602.001  10.849
a3 HERE  321.900  603.888  78.847
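
To see the same thing without changing a display option, here is a minimal sketch that rebuilds the frame from the sample values in the question (the arrays themselves are not shown there, so the lists below are assumptions copied from the printed table). It checks the labels stored in the index and the text that to_csv produces.

```python
import pandas as pd

# Assumed sample data, copied from the table printed in the question.
cases = ['a1', 'a2', 'a3']
stat_index = ['HELO', 'HERE']
data = {
    'RED': [304.907, 508.670, 448.974, 764.727, 412.539, 321.9],
    'GREEN': [286.074, 509.784, 509.406, 432.084, 602.001, 603.888],
    'BLUE': [12.498, 94.550, 56.466, 43.462, 10.849, 78.847],
}
df = pd.DataFrame(data, index=pd.MultiIndex.from_product([cases, stat_index]))

# The index stores the first-level label on every row;
# the default display merely hides the repeats.
first_level = list(df.index.get_level_values(0))
print(first_level)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv()
print(csv_text)
```

Both the stored labels and the CSV rows carry a1, a2, and a3 twice each, which is why the file looks different from the default printed frame.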

The main reason is compatibility: if you use read_csv on a file with blank cells in the MultiIndex columns, the data needs preprocessing before the index can be rebuilt.

If you still want blank cells in place of the repeated labels, it is possible:

a = df.index.get_level_values(0)
# ~a.duplicated() is True for the first occurrence of each label, so where()
# keeps that label and replaces the later repeats with ''
df.index = pd.MultiIndex.from_arrays([a.where(~a.duplicated(), ''),
                                      df.index.get_level_values(1)])

with pd.option_context('display.multi_sparse', False):
    print(df)
             RED    GREEN    BLUE
a1 HELO  304.907  286.074  12.498
   HERE  508.670  509.784  94.550
a2 HELO  448.974  509.406  56.466
   HERE  764.727  432.084  43.462
a3 HELO  412.539  602.001  10.849
   HERE  321.900  603.888  78.847
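
To tie this back to the to_csv call in the question, the following is a rough sketch of both directions: writing a CSV with blanked labels, and the preprocessing needed to read it back. The level names case and stat are hypothetical (the question's index has no names), the single RED column reuses values from the question, and the blanking is done on a copy so the original index stays intact.

```python
import io

import pandas as pd

# 'case' and 'stat' are hypothetical level names added for this example.
index = pd.MultiIndex.from_product([['a1', 'a2', 'a3'], ['HELO', 'HERE']],
                                   names=['case', 'stat'])
df = pd.DataFrame({'RED': [304.907, 508.670, 448.974, 764.727, 412.539, 321.9]},
                  index=index)

# Blank every repeated first-level label, working on a copy.
sparse = df.copy()
first = sparse.index.get_level_values(0)
sparse.index = pd.MultiIndex.from_arrays(
    [first.where(~first.duplicated(), ''), sparse.index.get_level_values(1)],
    names=df.index.names,
)
csv_text = sparse.to_csv()
print(csv_text)

# Reading it back: blank cells are parsed as NaN, so the first column
# has to be forward-filled before the MultiIndex can be rebuilt.
raw = pd.read_csv(io.StringIO(csv_text))
raw['case'] = raw['case'].ffill()
restored = raw.set_index(['case', 'stat'])
labels_match = restored.index.tolist() == df.index.tolist()
print(labels_match)
```

The forward-fill step is the preprocessing cost mentioned in the answer; a CSV written with the repeated labels can instead be read directly with index_col=[0, 1].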
