Adding a row to a MultiIndex DataFrame/Series


Problem description


I was wondering if there is an equivalent way to add a row to a Series or DataFrame with a MultiIndex as there is with a single index, i.e. using .ix or .loc?

I thought the natural way would be something like

row_to_add = pd.MultiIndex.from_tuples()
df.ix[row_to_add] = my_row

but that raises a KeyError. I know I can use .append(), but I would find it much neater to use .ix[] or .loc[].

Here is an example:

>>> import datetime as dt
>>> import pandas as pd
>>> df = pd.DataFrame({'Time': [dt.datetime(2013,2,3,9,0,1), dt.datetime(2013,2,3,9,0,1)], 'hsec': [1,25], 'vals': [45,46]})
>>> df
                 Time  hsec  vals
0 2013-02-03 09:00:01     1    45
1 2013-02-03 09:00:01    25    46

[2 rows x 3 columns]
>>> df.set_index(['Time','hsec'],inplace=True)
>>> ind = pd.MultiIndex.from_tuples([(dt.datetime(2013,2,3,9,0,2),0)],names=['Time','hsec'])
>>> df.ix[ind] = 5

Traceback (most recent call last):
  File "<pyshell#201>", line 1, in <module>
    df.ix[ind] = 5
  File "C:\Program Files\Python27\lib\site-packages\pandas\core\indexing.py", line 96, in __setitem__
    indexer = self._convert_to_indexer(key, is_setter=True)
  File "C:\Program Files\Python27\lib\site-packages\pandas\core\indexing.py", line 967, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "[(Timestamp('2013-02-03 09:00:02', tz=None), 0L)] not in index"

Solution

You have to pass the row key as a tuple for multi-index setting to work, and you have to fully specify all axes (that is, the trailing : for the columns is required):

In [26]: df.ix[(dt.datetime(2013,2,3,9,0,2),0),:] = 5

In [27]: df
Out[27]: 
                          vals
Time                hsec      
2013-02-03 09:00:01 1       45
                    25      46
2013-02-03 09:00:02 0        5
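For readers on current pandas: .ix was removed in pandas 1.0, so the same idea has to be written with .loc. The following is a minimal sketch of that, assuming a recent pandas version; the variable name new_key is only illustrative and does not come from the answer above.

```python
import datetime as dt
import pandas as pd

# Rebuild the frame from the question with a two-level (Time, hsec) index.
df = pd.DataFrame({
    'Time': [dt.datetime(2013, 2, 3, 9, 0, 1), dt.datetime(2013, 2, 3, 9, 0, 1)],
    'hsec': [1, 25],
    'vals': [45, 46],
}).set_index(['Time', 'hsec'])

# The row key is a plain tuple with one entry per index level
# (not a MultiIndex object). "new_key" is a hypothetical name.
new_key = (dt.datetime(2013, 2, 3, 9, 0, 2), 0)

# Setting with enlargement: the trailing ":" selects all columns.
df.loc[new_key, :] = 5

print(df)
```

Depending on the pandas version, the dtype of vals may change (for example from int64 to float64) when the frame is enlarged this way, so it can be worth checking df.dtypes afterwards.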

It is usually easier to reindex and/or concat/append a new DataFrame, though. Setting with enlargement like this generally only makes sense for a small number of values, because each such assignment makes a copy.
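The concat route mentioned above could look roughly like the sketch below. It assumes a recent pandas version (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead); the names new_rows and combined are illustrative only.

```python
import datetime as dt
import pandas as pd

# Same starting frame as in the question.
df = pd.DataFrame({
    'Time': [dt.datetime(2013, 2, 3, 9, 0, 1), dt.datetime(2013, 2, 3, 9, 0, 1)],
    'hsec': [1, 25],
    'vals': [45, 46],
}).set_index(['Time', 'hsec'])

# Collect all rows to add in one frame that carries the same index levels.
# "new_rows" and "combined" are hypothetical names for this sketch.
new_rows = pd.DataFrame(
    {'vals': [5]},
    index=pd.MultiIndex.from_tuples(
        [(dt.datetime(2013, 2, 3, 9, 0, 2), 0)],
        names=['Time', 'hsec'],
    ),
)

# A single concat copies the data once, however many rows are added.
combined = pd.concat([df, new_rows])

print(combined)
```

Gathering the new rows first and concatenating once avoids the repeated copying that row-by-row enlargement incurs.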
