Avoiding Python sum default start arg behavior


Question

I am working with a Python object that implements __add__ but does not subclass int. MyObj1 + MyObj2 works fine, but sum([MyObj1, MyObj2]) raises a TypeError, because sum() first attempts 0 + MyObj. In order to use sum(), my object needs __radd__ to handle 0 + MyObj, or I need to provide an empty object as the start parameter. The object in question is not designed to be empty.
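The failure described above can be reproduced with a small stand-in class. This is only a sketch: the name Loc and its values attribute are hypothetical and are not taken from the module in question.

```python
class Loc:
    """Hypothetical stand-in for an object that defines __add__ only."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        return Loc(self.values + other.values)


a, b = Loc([1]), Loc([2])
combined = a + b            # works: Loc.__add__ handles Loc + Loc

try:
    sum([a, b])             # sum() starts from 0, so it evaluates 0 + a first
except TypeError as exc:
    error = str(exc)        # neither int nor Loc knows how to do int + Loc
```

Because Loc has no __radd__, Python has nothing left to try once int.__add__ gives up, which is why the TypeError appears before any two Loc objects are ever added to each other.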

Before anyone asks, the object is not list-like or string-like, so use of join() or itertools would not help.

Edit for details: the module has a SimpleLocation and a CompoundLocation. I'll abbreviate Location to Loc. A SimpleLoc contains one right-open interval, i.e. [start, end). Adding SimpleLocs yields a CompoundLoc, which contains a list of the intervals, e.g. [[3, 6), [10, 13)]. End uses include iterating through the union, e.g. [3, 4, 5, 10, 11, 12], checking length, and checking membership.

The numbers can be relatively large (say, smaller than 2^32 but commonly 2^20). The intervals probably won't be extremely long (100-2000, but could be longer). Currently, only the endpoints are stored. I am now tentatively thinking of attempting to subclass set such that the location is constructed as set(xrange(start, end)). However, adding sets will give Python (and mathematicians) fits.

Questions I've looked at:

I'm considering two solutions. One is to avoid sum() and use the loop offered in this comment. I don't understand why sum() begins by adding the 0th item of the iterable to 0 rather than adding the 0th and 1st items (like the loop in the linked comment); I hope there's an arcane integer optimization reason.
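The linked comment is not reproduced here, so the following is only a guess at what such a loop might look like; the function name sum_without_start is made up for illustration. It seeds the running total with the first item instead of 0.

```python
def sum_without_start(iterable):
    """Add up the items of `iterable`, seeding the total with the first item."""
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        # With no start value there is nothing sensible to return here.
        raise ValueError("cannot sum an empty iterable without a start value") from None
    for item in it:
        total = total + item
    return total


# Tuples support + with each other but not with the integer 0.
joined = sum_without_start([(1,), (2,), (3,)])
```

The empty-iterable branch hints at one likely reason for the built-in's behaviour: sum([]) has to return something, and 0 is the value it falls back on when no start is given.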

My other solution is as follows; while I don't like the hard-coded zero check, it's the only way I've been able to make sum() work.

# ...
def __radd__(self, other):
    # This allows sum() to work (the default start value is zero)
    if other == 0:
        return self
    return self.__add__(other)

In summary, is there another way to use sum() on objects that can neither be added to integers nor be empty?

Recommended answer

Instead of sum, use:

import operator
from functools import reduce
reduce(operator.add, seq)

In Python 2 reduce was a built-in, so this looks like:

import operator
reduce(operator.add, seq)

reduce is generally more flexible than sum: you can provide any binary function, not only add, and you can optionally provide an initial element, whereas sum always uses one.
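A short illustration of both points, using only the standard library; the sample data is invented for the example.

```python
import operator
from functools import reduce

# Any binary function works, not only addition.
product = reduce(operator.mul, [2, 3, 4])
joined = reduce(operator.add, [(1,), (2,), (3,)])   # no start value needed

# The initial element is optional; when given, it is also the result
# for an empty input.
with_initial = reduce(operator.add, [], ())

# Without an initial element, an empty input is an error.
try:
    reduce(operator.add, [])
    empty_failed = False
except TypeError:
    empty_failed = True
```

The last case is the price of dropping the start value: reduce has nothing to return for an empty sequence, so callers that may see empty input still need to supply an initial element.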

Also note: (Warning: maths rant ahead)

Providing support for add w/r/t objects that have no neutral element is a bit awkward from the algebraic point of view.

Notice that all of:

  • naturals
  • reals
  • complex numbers
  • N-d vectors
  • NxM matrices
  • strings

together with addition form a Monoid - i.e. they are associative and have some kind of neutral element.
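The two monoid laws can be spot-checked on sample values. The helper names below are hypothetical, and passing such a check on a handful of samples is evidence, not a proof.

```python
from itertools import product


def has_neutral_element(zero, samples):
    """True if zero + x == x == x + zero for every sample."""
    return all(zero + x == x and x + zero == x for x in samples)


def is_associative(samples):
    """True if (a + b) + c == a + (b + c) for every triple of samples."""
    return all((a + b) + c == a + (b + c)
               for a, b, c in product(samples, repeat=3))


ints_ok = has_neutral_element(0, [1, 2, 3]) and is_associative([1, 2, 3])
strs_ok = has_neutral_element("", ["a", "bc"]) and is_associative(["a", "bc"])
tuples_ok = has_neutral_element((), [(1,), (2, 3)]) and is_associative([(1,), (2, 3)])
```

The neutral element is exactly what sum() needs as its start value, which is why a type without one fits sum() poorly.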

If your operation isn't associative and doesn't have a neutral element, then it doesn't "resemble" addition. Hence, don't expect it to work well with sum.

In such a case, you might be better off using a function or a method instead of an operator. This may be less confusing, since users of your class, seeing that it supports +, are likely to expect it to behave like a monoid (as addition normally does).

Thanks for expanding, I'll refer to your particular module now:

There are 2 concepts here:

  • simple locations,
  • compound locations.

It indeed makes sense that simple locations could be added, but they don't form a monoid because their addition doesn't satisfy the basic property of closure - the sum of two SimpleLocs isn't a SimpleLoc. It's, generally, a CompoundLoc.

OTOH, CompoundLocs with addition look like a monoid to me (a commutative monoid, while we're at it): a sum of those is a CompoundLoc too, their addition is associative and commutative, and the neutral element is an empty CompoundLoc that contains zero SimpleLocs.

If you agree with me (and the above matches your implementation), then you'll be able to use sum as follows:

# start is passed positionally; the start= keyword form needs Python 3.8+
sum([SimpleLoc1, SimpleLoc2, SimpleLoc3], CompoundLoc())

Indeed, this seems to work.
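To make the idea concrete, here is a minimal sketch under stated assumptions: the class bodies are hypothetical and only mirror the description in the question (one right-open interval per SimpleLoc, a list of intervals per CompoundLoc). Nothing here merges or sorts intervals, which a real implementation would probably need in order to be truly commutative.

```python
class CompoundLoc:
    """Hypothetical sketch: a list of right-open (start, end) intervals."""

    def __init__(self, intervals=()):
        self.intervals = list(intervals)

    def __add__(self, other):
        if isinstance(other, SimpleLoc):
            other = CompoundLoc([(other.start, other.end)])
        if not isinstance(other, CompoundLoc):
            return NotImplemented
        return CompoundLoc(self.intervals + other.intervals)


class SimpleLoc:
    """Hypothetical sketch: one right-open interval [start, end)."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __add__(self, other):
        # SimpleLoc + anything is promoted to a CompoundLoc first.
        return CompoundLoc([(self.start, self.end)]) + other


locs = [SimpleLoc(3, 6), SimpleLoc(10, 13)]
total = sum(locs, CompoundLoc())    # the empty CompoundLoc is the neutral element
empty = sum([], CompoundLoc())      # an empty input simply yields the start value
```

No __radd__ and no zero check are needed, because sum() never touches the integer 0 when a start value is supplied.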

"I am now tentatively thinking of attempting to subclass set such that the location is constructed as set(xrange(start, end)). However, adding sets will give Python (and mathematicians) fits."

Well, locations are some sets of numbers, so it makes sense to throw a set-like interface on top of them (so __contains__, __iter__, __len__, perhaps __or__ as an alias of +, __and__ as the product, etc).
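A set-like interface over stored endpoints might look roughly like this. The class name IntervalSet and the merging helper are assumptions made for the sketch; __and__ is left out to keep it short.

```python
def _merge(intervals):
    """Sort right-open intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if start >= end:
            continue                      # skip empty intervals
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


class IntervalSet:
    """Hypothetical set-like view over right-open (start, end) pairs."""

    def __init__(self, intervals=()):
        self.intervals = _merge(intervals)

    def __contains__(self, n):
        return any(start <= n < end for start, end in self.intervals)

    def __iter__(self):
        for start, end in self.intervals:
            yield from range(start, end)

    def __len__(self):
        return sum(end - start for start, end in self.intervals)

    def __or__(self, other):
        return IntervalSet(self.intervals + other.intervals)

    __add__ = __or__                      # + as an alias of union


s = IntervalSet([(3, 6), (10, 13)])
```

Membership and length are computed from the endpoints alone, so a location spanning a few thousand positions still costs only two integers per interval.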

As for construction from xrange, do you really need it? If you know that you're storing sets of intervals, then you're likely to save space by sticking to your representation of [start, end) pairs. You could throw in a utility method that takes an arbitrary sequence of integers and translates it to an optimal SimpleLoc or CompoundLoc if you feel it's going to help.
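Such a utility could be sketched as below. The function name intervals_from_ints is hypothetical, and it returns plain (start, end) pairs rather than SimpleLoc or CompoundLoc objects, since their constructors are not shown in the question.

```python
def intervals_from_ints(numbers):
    """Collapse integers into the fewest right-open (start, end) intervals."""
    intervals = []
    for n in sorted(set(numbers)):
        if intervals and intervals[-1][1] == n:
            # n directly follows the last interval, so extend it by one.
            intervals[-1] = (intervals[-1][0], n + 1)
        else:
            intervals.append((n, n + 1))
    return intervals


pairs = intervals_from_ints([3, 4, 5, 10, 11, 12])
```

A caller could then build a SimpleLoc when exactly one pair comes back and a CompoundLoc otherwise.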
