Binary operation broadcasting across a MultiIndex

Question

Can anyone explain why broadcasting across a MultiIndexed Series doesn't work? Might it be a bug in pandas (0.12.0)?

import pandas as pd

x = pd.DataFrame({'year':[1,1,1,1,2,2,2,2],
                  'country':['A','A','B','B','A','A','B','B'],
                  'prod':[1,2,1,2,1,2,1,2],
                  'val':[10,20,15,25,20,30,25,35]})
x = x.set_index(['year','country','prod']).squeeze()

y = pd.DataFrame({'year':[1,1,2,2],'prod':[1,2,1,2],
                  'mul':[10,0.1,20,0.2]})
y = y.set_index(['year','prod']).squeeze()

From the description of matching/broadcasting behavior from the pandas docs I would expect to be able to multiply x and y and have the values of y broadcast across each country, giving:

>>> x.mul(y, level=['year','prod'])
year  country  prod
1     A        1       100.0
               2         2.0
      B        1       150.0
               2         2.5
2     A        1       400.0
               2         6.0
      B        1       500.0
               2         7.0

But instead, I get:

Exception: Join on level between two MultiIndex objects is ambiguous

(Note that this is a variant of a related question on this topic.)

Recommended answer

As discussed by me and @jreback in the issue opened to deal with this, a nice workaround to the problem involves doing the following:

  1. Move the non-matching index level(s) to columns using unstack
  2. Perform the multiplication/division
  3. Put the non-matching index level(s) back using stack
  4. Make sure the index levels are in the same order as they were before.

Here's how it works:

In [112]: x.unstack('country').mul(y, axis=0).stack('country').reorder_levels(x.index.names)
Out[112]: 
year  country  prod
1     A        1       100.0
      B        1       150.0
      A        2         2.0
      B        2         2.5
2     A        1       400.0
      B        1       500.0
      A        2         6.0
      B        2         7.0
dtype: float64

I think that's rather good, and should be pretty efficient.
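
For readers who want to run the workaround end to end, here is a minimal sketch that breaks the one-liner above into its four steps. The intermediate names (`wide`, `scaled`, `result`) are illustrative only. The final `sort_index()` is an addition not present in the answer; it is there only so the rows come back in the order shown in the question. Newer pandas versions may emit a deprecation warning from `stack`, so treat this as a sketch rather than a version-proof recipe.

```python
import pandas as pd

# Same data as in the question.
x = pd.DataFrame({'year': [1, 1, 1, 1, 2, 2, 2, 2],
                  'country': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
                  'prod': [1, 2, 1, 2, 1, 2, 1, 2],
                  'val': [10, 20, 15, 25, 20, 30, 25, 35]})
x = x.set_index(['year', 'country', 'prod']).squeeze()

y = pd.DataFrame({'year': [1, 1, 2, 2], 'prod': [1, 2, 1, 2],
                  'mul': [10, 0.1, 20, 0.2]})
y = y.set_index(['year', 'prod']).squeeze()

# 1. Move the non-matching level ('country') to the columns,
#    leaving a (year, prod) index that matches y's index.
wide = x.unstack('country')

# 2. Multiply row-wise: y is aligned on the shared (year, prod) index
#    and applied to every country column.
scaled = wide.mul(y, axis=0)

# 3. Put 'country' back into the index.
stacked = scaled.stack('country')

# 4. Restore the original level order; sorting is an extra step added
#    here so the rows appear in the same order as in x.
result = stacked.reorder_levels(x.index.names).sort_index()

print(result)
```

Splitting the chain this way makes it easy to inspect `wide` and `scaled` when the levels or names in your own data differ from the example.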
