Ellipsis broadcasting in numpy.einsum

Problem description

I'm having a problem understanding why the following doesn't work:

I have an array prefactor that can be three-dimensional or six-dimensional. I have an array dipoles that has four dimensions. The first three dimensions of dipoles match the last three dimensions of prefactor.

As I don't know the shape of prefactor, I'm using an Ellipsis to account for the three optional dimensions in prefactor:

numpy.einsum('...lmn,lmno->...o', prefactor, dipoles)

In the example here, prefactor.shape is (1, 1, 1, 160, 160, 128) and dipoles.shape is (160, 160, 128, 3). When executing, I get the error:

operand 1 did not have enough dimensions to match the broadcasting, and couldn't be extended because einstein sum subscripts were specified at both the start and end

It does work, however, when I add an ellipsis to the second term as well:

numpy.einsum('...lmn,...lmno->...o', prefactor, dipoles)

I don't understand why, though, because there should be no need for an ellipsis there. Does anyone know what's going on here?

The same question has been asked at http://comments.gmane.org/gmane.comp.python.numeric.general/53705 but there is no satisfactory answer yet.

Solution

There is a GitHub issue for this problem:

https://github.com/numpy/numpy/issues/2455 improvement of index notation in einsum (Trac #1862)

Error case:

einsum('ij...,j->ij...',A,B)

The current workaround requires an (empty) ellipsis:

einsum('ij...,j...->ij...',A,B)
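
To make the workaround concrete, here is a minimal sketch with small, assumed shapes (A and B are not given in the issue text, so the arrays below are only illustrative). It checks that the form with the empty ellipsis on the second operand gives the same result as ordinary explicit broadcasting:

```python
import numpy as np

# Illustrative shapes (an assumption): A has axes (i, j, k), B has axis (j,).
rng = np.random.default_rng(0)
A = rng.random((2, 3, 4))
B = rng.random(3)

# Workaround form: the ellipsis on B matches zero dimensions.
result = np.einsum('ij...,j...->ij...', A, B)

# The same product written with explicit broadcasting of B along i and k.
expected = A * B[None, :, None]

print(result.shape)
print(np.allclose(result, expected))
```

The ellipsis on the second operand adds no dimensions here; it only tells einsum that broadcasting is allowed on that operand.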

It looks like einsum loops through the string argument and the ops several times, identifying the indexes, the broadcast types (right, left, middle, none), and the op dimensions. From these it constructs a numpy.nditer. It is while constructing op_axes for the nditer that einsum raises this error. I don't know whether the test criterion is too tight (ibroadcast >= ndim), or whether it needs to take an additional step to construct the right op_axes for this argument.

https://github.com/numpy/numpy/issues/2619 shows how nditer can be used to replicate einsum behavior. Working from this I can replicate your calculation thus:

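import numpy as np

prefactor = np.random.random((1, 1, 1, 160, 160, 128))
dipoles = np.random.random((160, 160, 128, 3))
x = np.einsum('...lmn,...lmno->...o', prefactor, dipoles)
# np.einsum('...lmn,lmno->...o', prefactor, dipoles)  # the form from the question that raised the error

nops = 2  # number of input operands
# One row per operand (two inputs, then the output); each entry is the
# operand axis used for that iterator axis, or -1 if the operand has none.
op_axes = [[0,1,2,3,4,5,-1], [-1,-1,-1,0,1,2,3], [0,1,2,-1,-1,-1,3]]
flags = ['reduce_ok','buffered', 'external_loop', 'delay_bufalloc', 'grow_inner']
op_flags = [['readonly']]*nops + [['allocate','readwrite']]
# The context manager makes sure buffered writes reach the output operand.
with np.nditer([prefactor,dipoles,None], flags, op_flags, op_axes=op_axes) as it:
    it.operands[nops][...] = 0  # zero the allocated output before accumulating
    it.reset()  # needed because of 'delay_bufalloc'
    #it.debug_print()
    for (a,b,w) in it:
        w[...] += a*b
    result = it.operands[nops]
print("\nnditer usage:")
print(result) # == x
print(result.shape) # (1, 1, 1, 3)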

The op_axes line is indicative of what einsum deduces from '...lmn,...lmno->...o'.


I am exploring this issue on https://github.com/hpaulj/numpy-einsum.

There I have an einsum_py.py, which emulates np.einsum in Python code. The part that is relevant to this issue is parse_subscripts(), and in particular prepare_op_axes(). It appears that only the BROADCAST_RIGHT iteration (starting from the end) is needed to create op_axes correctly, regardless of where the ellipses are in the subscripts. Using only that iteration also removes the error message that is at the core of this issue.
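
The idea of right-aligned broadcasting can be sketched in a few lines of Python. This is not the code from einsum_py.py or einsum.c.src: the function name sketch_op_axes is hypothetical, the ordering of iterator axes (broadcast dimensions first, then labels in order of first appearance) is an assumption chosen to mirror the op_axes line above, and repeated labels within one operand are not handled.

```python
def sketch_op_axes(subscripts, ndims):
    """Hypothetical helper: build nditer-style op_axes from einsum subscripts.

    `ndims` lists the number of dimensions of each input operand.
    """
    inputs, output = subscripts.replace(" ", "").split("->")
    layouts = []
    for term, ndim in zip(inputs.split(","), ndims):
        head, dots, tail = term.partition("...")
        n_ell = ndim - len(head) - len(tail)  # axes covered by the ellipsis
        if n_ell < 0 or (not dots and n_ell != 0):
            raise ValueError("subscripts do not match operand dimensions")
        layouts.append((head, n_ell, tail))

    # Number of broadcast dimensions of the iterator.
    n_bcast = max(n_ell for _, n_ell, _ in layouts)

    # Labels in order of first appearance across the inputs.
    labels = []
    for head, _, tail in layouts:
        for ch in head + tail:
            if ch not in labels:
                labels.append(ch)

    def axes_for(head, n_ell, tail):
        # Right-align the operand's ellipsis axes within the broadcast
        # dimensions; missing leading dimensions are marked with -1.
        bcast = [-1] * (n_bcast - n_ell) + [len(head) + k for k in range(n_ell)]
        pos = {ch: i for i, ch in enumerate(head)}
        pos.update({ch: len(head) + n_ell + i for i, ch in enumerate(tail)})
        return bcast + [pos.get(ch, -1) for ch in labels]

    op_axes = [axes_for(*layout) for layout in layouts]

    # The output carries all broadcast dimensions if it has an ellipsis.
    head, dots, tail = output.partition("...")
    op_axes.append(axes_for(head, n_bcast if dots else 0, tail))
    return op_axes


# The subscripts from the question, without an ellipsis on the second term.
for row in sketch_op_axes('...lmn,lmno->...o', [6, 4]):
    print(row)
```

With this right-aligned rule, an operand without an ellipsis simply gets -1 for every broadcast dimension, so the second term needs no special case.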

The einsum.c.src file on that repository has this change, and compiles correctly with the current master distribution (just replace the file and build). It tests fine against test_einsum.py, as well as examples from this issue.

I've submitted a pull request for this change.
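
Whether the form from the question is accepted depends on the NumPy build in use. One hedged way to stay independent of that is to try the ellipsis-free second term first and fall back to the workaround. The helper name contract and the small shapes below are assumptions made for illustration; np.tensordot serves as an independent reference.

```python
import numpy as np

rng = np.random.default_rng(0)
dipoles = rng.random((4, 5, 6, 3))  # small stand-in for (160, 160, 128, 3)


def contract(prefactor, dipoles):
    """Try the form from the question, then fall back to the workaround."""
    try:
        out = np.einsum('...lmn,lmno->...o', prefactor, dipoles)
        return out, 'original'
    except ValueError:
        out = np.einsum('...lmn,...lmno->...o', prefactor, dipoles)
        return out, 'workaround'


# prefactor may be three-dimensional or six-dimensional, as in the question.
for shape in [(4, 5, 6), (2, 1, 2, 4, 5, 6)]:
    prefactor = rng.random(shape)
    out, form = contract(prefactor, dipoles)
    # tensordot contracts the last three axes of prefactor with the
    # first three axes of dipoles, which is the same sum over l, m, n.
    reference = np.tensordot(prefactor, dipoles, axes=3)
    print(shape, form, out.shape, np.allclose(out, reference))
```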
