Iterable collection that can be mutated during iteration

Question

Is there a collection data structure in Java (and C# if you know) that can be iterated over, with the following properties:


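  • The current element can be removed without affecting the current iterator (that is, the remaining iterations of an iterator that has already started).

  • New elements can be added, again without affecting the current iterator: they are not included among the iterated values while the current iterator's iteration is still in progress. In my case only one new element would be added per iteration, but none should be seen until a new iterator is fetched from the iterable.

  • The order of elements doesn't matter.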

In effect, there is an incoming list and an outgoing list of items. The incoming list is iterated and some are copied to a new list. Some new elements can be added to the new list during iteration. After the iteration is over, the old incoming list is replaced with the new outgoing list. This whole process is itself within a loop.

So, copying the elements to a newly constructed collection object each time seems inefficient compared to one that had these add/remove properties.
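For reference, the two-list process described above can be sketched roughly as follows. This is only an illustration: the class name TwoListLoop and the "keep" and "spawn" rules are made up for the example and are not taken from the actual code in question.

```java
import java.util.ArrayList;
import java.util.List;

class TwoListLoop {
    // One pass: iterate the incoming list, copy some items to a fresh outgoing
    // list, and add at most one new item per iterated element.
    static List<Integer> step(List<Integer> incoming) {
        List<Integer> outgoing = new ArrayList<>(incoming.size());
        for (int item : incoming) {
            if (item % 2 == 0) {        // hypothetical "keep" rule
                outgoing.add(item);
            }
            if (item % 3 == 0) {        // hypothetical "spawn" rule
                outgoing.add(item + 1); // not seen until the next pass
            }
        }
        return outgoing;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        // The whole process runs inside an outer loop; after each pass the
        // old incoming list is replaced by the new outgoing list.
        for (int round = 0; round < 3; round++) {
            items = step(items);
            System.out.println(items);
        }
    }
}
```

New elements only ever land in the outgoing list, so the pass that created them never visits them; the cost is one fresh list per pass.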

I was thinking of some kind of queue that lets me preview the current item, then either dequeue it or not, and move on to the next item. I could add more items to the head of the queue but wouldn't see them, because I'm moving toward the end. A doubly linked list could have these properties, right?

If you really want to know what it’s for, it’s to soup up the second large code block in an answer of mine.

Recommended answer

In Java there is CopyOnWriteArrayList, which does what you want: it makes a copy of the backing array every time you change anything. That means any iteration is 'set in stone' once you begin it, and therefore you can remove from or add to the underlying collection at will without affecting any running iterators whatsoever.
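A minimal sketch of that behaviour, assuming a small list of strings (the class and method names here are made up for the demonstration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIterationDemo {
    // Iterates the list while mutating it and returns the elements the loop saw.
    static List<String> iterateAndMutate(List<String> items) {
        List<String> seen = new ArrayList<>();
        for (String item : items) {
            seen.add(item);
            if (item.equals("a")) {
                items.remove(item);  // copies the backing array; the running iterator keeps its snapshot
            }
            if (item.equals("b")) {
                items.add("d");      // appended to a new array; invisible to this loop
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> items = new CopyOnWriteArrayList<>(List.of("a", "b", "c"));
        System.out.println(iterateAndMutate(items)); // [a, b, c]
        System.out.println(items);                   // [b, c, d]
    }
}
```

Note that removal goes through the list itself: the iterators of CopyOnWriteArrayList do not support remove() and throw UnsupportedOperationException if you call it.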

You can also build your own collection type that has this behaviour. It'd be a three-liner:

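import java.util.ArrayList;
import java.util.Iterator;

public class ConstantIterationArrayList<T> extends ArrayList<T> {
    // Iterate over a snapshot so later changes to this list never reach the iterator.
    @Override
    public Iterator<T> iterator() {
        return new ArrayList<T>(this).iterator();
    }
}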

(The above makes a copy of the list and then gives you an iterator for the copy, thus conveniently ensuring any modification to this list has absolutely no effect on that iterator).
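Here is roughly how using it looks. The class is repeated (without the public modifier) so the example stands alone; ConstantIterationDemo and its method are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ConstantIterationArrayList<T> extends ArrayList<T> {
    @Override
    public Iterator<T> iterator() {
        // The copy constructor uses toArray(), not iterator(), so this does not recurse.
        return new ArrayList<T>(this).iterator();
    }
}

class ConstantIterationDemo {
    // Mutates the list inside a for-each loop and returns what the loop visited.
    static List<String> visitWhileMutating(ConstantIterationArrayList<String> list) {
        List<String> visited = new ArrayList<>();
        for (String value : list) {      // for-each calls the overridden iterator()
            visited.add(value);
            if (value.equals("a")) {
                list.remove(value);      // no ConcurrentModificationException
            }
            if (value.equals("b")) {
                list.add("new");         // not visited by this loop
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        ConstantIterationArrayList<String> list = new ConstantIterationArrayList<>();
        list.addAll(List.of("a", "b", "c"));
        System.out.println(visitWhileMutating(list)); // [a, b, c]
        System.out.println(list);                     // [b, c, new]
    }
}
```

One caveat: calling remove() on the iterator itself only removes from the copy, so removals have to go through the list, as in the loop above.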

Here's the real problem with your question:

The above will make copies of the underlying data store from time to time: my snippet above does so every time you make an iterator, and CopyOnWriteArrayList does so every time you call remove() or add(). The operation 'copy the underlying data store' takes O(n) time, as in, it takes twice as long for a list that is twice as large.

ArrayList in general has the property that the remove() operation, unless you are removing an element at or very close to the end of the list, is an O(n) operation: Removing an element from a list takes twice as long if the list is twice as large.

Fortunately, modern CPUs have sizable caches and can work exceedingly fast within a cache page. Which translates to: Despite the fact that copying data over feels inefficient, in practice, as long as the backing array fits within a page or so, it's much faster than data stores based on LinkedList semantics. We're talking about up to ~1000 elements give or take. (Note, in general, almost everything you do to a LinkedList is O(n) and where ArrayList tends to work well with modern CPU architecture, LinkedList tends to do very poorly. The point being: LinkedList is very rarely the right answer either!)

So, if you have no more than ~1000 items in this list, I'd go ahead with CopyOnWriteArrayList or the custom class I wrote for you above.

However, if you have more than that, ArrayList is not the right data store to use here. Even if you forget about your constant iteration needs for now, calling remove() on large array lists is a bad idea (unless removing very close to the end of the list). In this case I'd sketch out precisely which operations you need to do on this data type and exactly which ones really need to be fast. Once you have a complete list, try to find a collection type which fits your needs exactly, and in the (likely) case that nothing exists that is a perfect match, make one yourself. Like above, when you have to roll your own data type it's usually a good idea to let the bulk of the work be done by existing data types, so either extend an existing one or encapsulate one.
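As one possible illustration of the 'encapsulate one' route, here is a sketch that wraps an ArrayList and leans on the fact that order doesn't matter. To be clear, this is a technique swapped in for the example, not something the approaches above do: SwapRemoveBag, sweep and removeAt are hypothetical names, and removal works by moving the last element into the hole so that it never shifts the rest of the array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SwapRemoveBag<T> {
    private final ArrayList<T> items = new ArrayList<>();

    public void add(T item) { items.add(item); }

    public int size() { return items.size(); }

    public List<T> toList() { return new ArrayList<>(items); }

    // O(1) removal: overwrite the slot with the last element, then drop the
    // last slot. Element order is not preserved.
    public T removeAt(int index) {
        int last = items.size() - 1;
        T removed = items.get(index);
        items.set(index, items.get(last));
        items.remove(last);
        return removed;
    }

    // Visits every element present when the sweep starts, exactly once, and
    // removes those for which keep returns false. The predicate may call add()
    // on this bag; it must not remove elements itself.
    public void sweep(Predicate<T> keep) {
        // Walking backwards means anything appended during the sweep, or
        // swapped into a slot by removeAt, sits at an index already passed.
        for (int i = items.size() - 1; i >= 0; i--) {
            if (!keep.test(items.get(i))) {
                removeAt(i);
            }
        }
    }

    public static void main(String[] args) {
        SwapRemoveBag<Integer> bag = new SwapRemoveBag<>();
        for (int i = 1; i <= 5; i++) bag.add(i);
        bag.sweep(v -> {
            if (v == 3) bag.add(30);  // added mid-sweep, not visited in this sweep
            return v % 2 == 1;        // keep odd values only
        });
        System.out.println(bag.toList());
    }
}
```

No copy is made per pass, and both add and remove are constant time, at the price of scrambling element order and forbidding removal from inside the predicate. Whether that trade is worth it depends on the list of operations you sketched out.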
