Can anyone explain this bizarre bug iterating over a set?

Question

I had a loop of the form for thing in a_set:. It was working incorrectly because, occasionally and inconsistently, it would pull the same thing from the set twice. (This does not cause the program to crash. It just gets the wrong answer.) I was not able to determine anything that was deterministic about the wrong behavior; but my attempts to debug it made it very clear that the bizarreness was happening sometimes. In the cases in which I observed it most closely, there were 3 items in the set (before and after) and the loop executed 4 times, once with a repeat of one of the items. The items were references to objects of a class I had created (treated more like a C struct). The bad behavior went away when I changed the for statement to for thing in list(a_set):.

I am at a total loss to explain the wrong behavior. I am very certain that nothing in the body of the loop can cause what it is doing to happen twice or change the value of the thing variable. I am fairly certain that what is going on in the loop could not try to affect the composition of the set. Furthermore, even if it could, I believe that would cause a RuntimeError. I am at a complete loss for coming up with hypotheses about what could possibly be causing this. The lack of repeatability running the same code consecutively is especially mysterious. My attempts to recreate the symptom in a simpler scenario have failed. Nevertheless, I would feel silly about leaving the list() invocation in there just to solve a problem I cannot explain. Anyone else's hypothesizing would be welcome. I need ideas about what sorts of things I should be trying to eliminate in debugging it.

Update: I think this question was incorrectly put on hold based on a claim that it was off topic. The lack of reproducibility was the issue in this case, and I suspected that there was some nuance of the language that I was missing. Indeed, that does turn out to be the case, and MSeifert's answer put me on to what was causing it. However, it was not quite as simple as what he speculated, as I note in a comment on his answer.

I also confused the issue by saying the objects in the set were mutable. They are not. They are references to objects whose attributes are changeable. (That could have been inferred from what I wrote, but I was incorrectly using the word "mutable" in a general sense and not in the Python technical sense.) What is hashed is the address of the object, independent of the values of its attributes. Were those object references mutable, Python would never have let me put them in a set in the first place.

Recommended answer

If the error went away when you added list(a_set), it is very likely that you changed the set during the iteration. In general this raises a RuntimeError, but the check only compares the set's size against its size when iteration started, so if you add as many elements as you remove it doesn't trigger:

a = {1,2,3}
for item in a:
    print(item)
    a.add(item+3)  # add one item
    a.remove(item) # remove one item

prints the numbers 1 to 31 (the exact count is an implementation detail, so you may see a different one). Before and after the loop, as well as at the beginning of each iteration, the set contains 3 elements.
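
For contrast, here is a minimal sketch of the case where the size does change during the loop. It assumes CPython, where the set iterator notices the size mismatch the next time it is advanced; the variable name caught is only for illustration.

```python
a = {1, 2, 3}
caught = False
try:
    for item in a:
        # Grows the set from 3 to 4 elements; nothing is removed to compensate.
        a.add(item + 3)
except RuntimeError:
    # Raised when the iterator is advanced after the set's size changed.
    caught = True

print(caught)
```

The add itself succeeds; the exception only surfaces when the for loop asks the iterator for the next element, which is why a balanced add and remove inside one iteration slips past the check.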

However, if I add a list call, it creates a copy (as a list) of the original set, and the loop only iterates over the elements that were present in the original set:

a = {1,2,3}
for item in list(a):
    print(item)
    a.add(item+3)
    a.remove(item)

print(a)

prints:

1
2
3
{4, 5, 6}   # totally changed! (Python 2 shows set([4, 5, 6]))

In the comments you noted that the classes you have in the set are mutable, so even though you might think you remove and add the same element, it may not be the same element anymore (from the point of view of the set). In general you shouldn't put instances of mutable classes in a set or use them as keys in a dict, because you have to be really careful that the mutability cannot affect the result of the __hash__ or __eq__ methods.
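
To illustrate that risk, here is a small sketch with a hypothetical class Box (not from the question) whose __hash__ is derived from a changeable attribute. Integer values are used so the hashes are predictable.

```python
class Box:
    """Hypothetical class: equality and hash both depend on a mutable attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


box = Box(1)
s = {box}

# Changing the attribute changes hash(box) while the object sits in the set.
box.value = 2

found_in_set = box in s          # lookup uses the new hash and misses
found_in_list = box in list(s)   # the very same object is still stored

print(found_in_set, found_in_list)
```

The set still holds the object, but it was filed under its old hash, so membership tests and remove() can no longer find it. A class that keeps the default identity-based hash, as described in the question, does not have this particular problem.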

Just an example that iterates over a seemingly "random" number of set elements:

class Fun(object):
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)

    def __eq__(self, other):
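        return self.value == other.value

    # Python 3 sets __hash__ to None when a class defines only __eq__,
    # which would make Fun unhashable; keep the default id-based hash.
    __hash__ = object.__hash__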

a = {Fun(1),Fun(2),Fun(3)}
for item in a:
    print(item)
    a.add(Fun(item.value+3))
    a.remove(item)

will actually show a "random" number of Fun objects each time I run the snippet. It is not really random: it depends on the hashes of the instances, and in this case each hash is derived from the id of the instance, which changes from run to run.
