Pandas multiple index dataframe: creating new index or appending to existing index


Problem description

I have a Pandas dataframe, multi_df, which has a multi-index made of the code, colour, texture and shape values, as shown below:

import pandas as pd
import numpy as np
df = pd.DataFrame({'id' : range(1,9),
                    'code' : ['one', 'one', 'two', 'three',
                                'two', 'three', 'one', 'two'],
                    'colour': ['black', 'white','white','white',
                            'black', 'black', 'white', 'white'],
                    'texture': ['soft', 'soft', 'hard','soft','hard',
                                        'hard','hard','hard'],
                    'shape': ['round', 'triangular', 'triangular','triangular','square',
                                        'triangular','round','triangular'],
                    'amount' : np.random.randn(8)},  columns= ['id','code','colour', 'texture', 'shape', 'amount'])
multi_df = df.set_index(['code','colour','texture','shape']).sort_index()['id']
multi_df
code   colour  texture  shape     
one    black   soft     round         1
       white   hard     round         7
               soft     triangular    2
three  black   hard     triangular    6
       white   soft     triangular    4
two    black   hard     square        5
       white   hard     triangular    3
                        triangular    8
Name: id, dtype: int64

I am given a pair of values: a new_index and a new_id. If the new_index (combination) already exists in multi_df, I want to append the new_id to the existing index. If the new_index does not exist, I want to create it and add the id value. For instance:

new_id = 15
new_index = ('two','white','hard', 'triangular')
if new_index in multi_df.index:
    # APPEND TO EXISTING: multi_df[('two','white','hard', 'triangular')].append(new_id)
    pass
else:
    # CREATE NEW index and put the new_id in.
    pass

However, I cannot figure out the syntax for appending (IF) or creating (ELSE) the new index. Any help would be most welcome.

P.S.: for appending, I can see that the object I am trying to add the new_id to is a Series. However, append() does not work.

type(multi_df[('two','white','hard', 'triangular')])
<class 'pandas.core.series.Series'>

Recommended answer

append() creates a new Series every time it is called, so it is very slow if you need to call it in a for loop:

data = pd.Series(15, index=pd.MultiIndex.from_tuples([('two','white','hard', 'triangular')]))
# append() returns a new Series and leaves multi_df untouched, so keep the result
multi_df = multi_df.append(data)
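Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions the same idea is usually written with pd.concat. Because a MultiIndex may hold duplicate keys, the "append to existing" and "create new" cases can be handled by the same operation: concatenate the new row and re-sort. The following is a minimal sketch under those assumptions, not the answerer's own code. The helper name add_ids and the ('four', 'red', 'soft', 'round') key are hypothetical, introduced only for illustration. Collecting all pairs first and concatenating once avoids rebuilding the Series on every loop iteration.

```python
import pandas as pd

# Rebuild the Series from the question ('amount' is omitted because it is not used).
df = pd.DataFrame({
    'id': range(1, 9),
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'colour': ['black', 'white', 'white', 'white', 'black', 'black', 'white', 'white'],
    'texture': ['soft', 'soft', 'hard', 'soft', 'hard', 'hard', 'hard', 'hard'],
    'shape': ['round', 'triangular', 'triangular', 'triangular', 'square',
              'triangular', 'round', 'triangular'],
})
multi_df = df.set_index(['code', 'colour', 'texture', 'shape']).sort_index()['id']


def add_ids(series, pairs):
    """Hypothetical helper: return a new, sorted Series with (index_tuple, id) pairs added."""
    tuples = [key for key, _ in pairs]
    values = [value for _, value in pairs]
    # Reuse the level names so the combined index keeps code/colour/texture/shape.
    new_index = pd.MultiIndex.from_tuples(tuples, names=series.index.names)
    new_rows = pd.Series(values, index=new_index, name=series.name)
    # A single concat for all new rows, instead of one append per row.
    return pd.concat([series, new_rows]).sort_index()


result = add_ids(multi_df, [
    (('two', 'white', 'hard', 'triangular'), 15),  # key already present in multi_df
    (('four', 'red', 'soft', 'round'), 16),        # key not present yet (made up here)
])

existing_key = ('two', 'white', 'hard', 'triangular')
# Select all rows for one full key; isin() returns a boolean mask over the index.
print(sorted(result[result.index.isin([existing_key])].tolist()))
```

Selecting with a boolean mask from index.isin() always returns a Series, whether the key occurs once or several times, which sidesteps the scalar-versus-Series difference seen in the question's type() check.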
