Binary operation broadcasting across a MultiIndex
Question
Can anyone explain why broadcasting across a multiindexed Series doesn't work? Might it be a bug in pandas (0.12.0)?
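import pandas as pd

x = pd.DataFrame({'year': [1, 1, 1, 1, 2, 2, 2, 2],
                  'country': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
                  'prod': [1, 2, 1, 2, 1, 2, 1, 2],
                  'val': [10, 20, 15, 25, 20, 30, 25, 35]})
# Series indexed by (year, country, prod)
x = x.set_index(['year', 'country', 'prod']).squeeze()
y = pd.DataFrame({'year': [1, 1, 2, 2], 'prod': [1, 2, 1, 2],
                  'mul': [10, 0.1, 20, 0.2]})
# Series indexed by (year, prod); it has no 'country' level
y = y.set_index(['year', 'prod']).squeeze()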
From the description of matching/broadcasting behavior in the pandas docs, I would expect to be able to multiply x and y and have the values of y broadcast across each country, giving:
>>> x.mul(y, level=['year','prod'])
year  country  prod
1     A        1       100.0
               2         2.0
      B        1       150.0
               2         2.5
2     A        1       400.0
               2         6.0
      B        1       500.0
               2         7.0
But instead, I get:
Exception: Join on level between two MultiIndex objects is ambiguous
(Note that this is a variant of another question on this topic.)
Recommended answer
As discussed by me and @jreback in the issue opened to deal with this, a nice workaround to the problem involves doing the following:
- Move the non-matching index level(s) to columns using unstack
- Perform the multiplication/division
- Put the non-matching index level(s) back using stack
- Make sure the index levels are in the same order as they were before.
Here's how it works:
In [112]: x.unstack('country').mul(y, axis=0).stack('country').reorder_levels(x.index.names)
Out[112]:
year  country  prod
1     A        1       100.0
      B        1       150.0
      A        2         2.0
      B        2         2.5
2     A        1       400.0
      B        1       500.0
      A        2         6.0
      B        2         7.0
dtype: float64
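The same chain can be written out one step at a time, which makes each intermediate shape easier to inspect. This is only a sketch that rebuilds x and y from the question. The names wide, scaled and result are illustrative and not part of the original answer, and the final sort_index call is an optional extra that puts the rows back in the order of x. Details of stack (row ordering, warnings) may vary between pandas versions.

```python
import pandas as pd

# Data from the question
x = pd.DataFrame({'year': [1, 1, 1, 1, 2, 2, 2, 2],
                  'country': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
                  'prod': [1, 2, 1, 2, 1, 2, 1, 2],
                  'val': [10, 20, 15, 25, 20, 30, 25, 35]})
x = x.set_index(['year', 'country', 'prod'])['val']

y = pd.DataFrame({'year': [1, 1, 2, 2], 'prod': [1, 2, 1, 2],
                  'mul': [10, 0.1, 20, 0.2]})
y = y.set_index(['year', 'prod'])['mul']

# 1. Move the non-matching level ('country') into the columns.
#    Rows are now indexed by (year, prod), the same levels as y.
wide = x.unstack('country')

# 2. Multiply every country column by y, aligning on the row index.
scaled = wide.mul(y, axis=0)

# 3. Put 'country' back into the index and restore the level order.
result = scaled.stack('country').reorder_levels(x.index.names)

# Optional (not in the original answer): sort rows to match x's order.
result = result.sort_index()
print(result)
```

Inspecting wide and scaled before stacking is a convenient way to confirm that the shared levels line up as intended.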
I think that's rather good, and should be pretty efficient.
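One way to sanity-check the result is to compute the same product by a different route. The sketch below is not the approach from the answer: it flattens x to columns and looks up the matching multiplier with DataFrame.join, using on= to match the two levels of y's index. The names flat and check are illustrative.

```python
import pandas as pd

# Data from the question
x = pd.DataFrame({'year': [1, 1, 1, 1, 2, 2, 2, 2],
                  'country': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
                  'prod': [1, 2, 1, 2, 1, 2, 1, 2],
                  'val': [10, 20, 15, 25, 20, 30, 25, 35]})
x = x.set_index(['year', 'country', 'prod'])['val']

y = pd.DataFrame({'year': [1, 1, 2, 2], 'prod': [1, 2, 1, 2],
                  'mul': [10, 0.1, 20, 0.2]})
y = y.set_index(['year', 'prod'])['mul']

# Flatten x to plain columns, then attach the matching 'mul' value
# for each row by joining on the (year, prod) keys of y's index.
flat = x.reset_index().join(y, on=['year', 'prod'])

# Restore the original three-level index and multiply row by row.
flat = flat.set_index(['year', 'country', 'prod'])
check = flat['val'] * flat['mul']
print(check)
```

Because the join is a left join, the rows keep the order of x, so check can be compared directly against the unstack/stack result after sorting both.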