Replacing values in a pandas multi-index

Problem description

I have a DataFrame with a multi-index. I want to change the value of the second index level when certain conditions on the first level are met. I found a similar (but different) question here: Replace a value in MultiIndex (pandas). It doesn't answer my question, because it was about changing a single row, and its solution also passed the value of the first index (which didn't need changing). In my case I am dealing with multiple rows, and I haven't been able to adapt that solution.

A minimal example of my data is below. Thanks!

import pandas as pd
import numpy as np

consdf=pd.DataFrame()

for mylocation in ['North','South']:
    for scenario in np.arange(1,4):
        df= pd.DataFrame()
        df['mylocation'] = [mylocation]
        df['scenario']= [scenario]
        df['this'] = np.random.randint(10,100)
        df['that'] = df['this']  * 2
        df['something else']  = df['this'] * 3
        consdf=pd.concat((consdf, df ), axis=0, ignore_index=True)

mypiv = consdf.pivot(index='mylocation', columns='scenario').transpose()

level_list =['this','that']
# if level 0 is in level_list --> set level 1 to np.nan
mypiv.iloc[mypiv.index.get_level_values(0).isin(level_list)].index.set_levels([np.nan], level =1, inplace=True)

The last line doesn't work. I get:

ValueError: On level 1, label max (2) >= length of level  (1). NOTE: this index is in an inconsistent state

Solution

IIUC, you could add a new value to the level values and then change the labels of your index, using advanced indexing together with the get_level_values, set_levels and set_labels methods:

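# number of rows whose level 0 is in level_list (assumed to come first in the index)
len_ind = len(mypiv.loc[(level_list,)].index.get_level_values(1))
# add NaN as a fourth value of level 1 (position 3)
mypiv.index.set_levels([1, 2, 3, np.nan], level=1, inplace=True)
# point the first len_ind rows at position 3 (NaN) and keep the remaining labels
mypiv.index.set_labels([3]*len_ind + mypiv.index.labels[1][len_ind:].tolist(), level=1, inplace=True)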

In [219]: mypiv
Out[219]: 
mylocation               North  South
               scenario              
this           NaN          26     46
               NaN          32     67
               NaN          75     30
that           NaN          52     92
               NaN          64    134
               NaN         150     60
something else  1.0         78    138
                2.0         96    201
                3.0        225     90

Note: the scenario values of the other rows are converted to float, because the level must have a single type and np.nan is a float.
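
The answer relies on set_labels and inplace=True, which belong to older pandas releases; in later versions set_labels was renamed set_codes and the inplace argument was removed. As a rough sketch only, and assuming a recent pandas, the same result can be obtained by rebuilding the index with MultiIndex.from_arrays. This is a swapped-in technique, not the answerer's method. The fixed sample values replace the question's random data (they are taken from the output shown above), and the names mask, level0, level1 and new_level1 are illustrative.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the question's random data (illustrative values).
consdf = pd.DataFrame({
    "mylocation": ["North"] * 3 + ["South"] * 3,
    "scenario": [1, 2, 3] * 2,
    "this": [26, 32, 75, 46, 67, 30],
})
consdf["that"] = consdf["this"] * 2
consdf["something else"] = consdf["this"] * 3

# Keyword arguments keep the pivot call valid on recent pandas versions.
mypiv = consdf.pivot(index="mylocation", columns="scenario").transpose()

level_list = ["this", "that"]

level0 = mypiv.index.get_level_values(0)
level1 = mypiv.index.get_level_values(1)

# Where level 0 is in level_list, replace the level-1 value with NaN.
# np.where returns a float array here, because NaN is a float.
mask = level0.isin(level_list)
new_level1 = np.where(mask, np.nan, level1)

# Rebuild the whole index instead of mutating levels and codes in place.
mypiv.index = pd.MultiIndex.from_arrays(
    [level0, new_level1], names=mypiv.index.names
)
print(mypiv)
```

Because the mask is computed from the level-0 values, this version does not depend on the rows for 'this' and 'that' coming first in the index.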
