Java 8: Extracting a pair of arrays out of a Stream<Pair>

Question

So I have some code using Java 8 streams, and it works. It does exactly what I need it to do, and it's legible (a rarity for functional programming). Towards the end of a subroutine, the code runs over a List of a custom pair type:

// All names Hungarian-Notation-ized for SO reading
class AFooAndABarWalkIntoABar
{
    public int      foo_int;
    public BarClass bar_object;
    ....
}

List<AFooAndABarWalkIntoABar> results = ....;

The data here must be passed into other parts of the program as arrays, so they get copied out:

// extract either a foo or a bar from each "foo-and-bar" (fab)
int[] foo_array = results.stream()
    .mapToInt (fab -> fab.foo_int)
    .toArray();

BarClass[] bar_array = results.stream()
    .map (fab -> fab.bar_object)
    .toArray(BarClass[]::new);

And done. Now each array can go do its thing.

Except... that loop over the List twice bothers me in my soul. And if we ever need to track more information, they're likely going to add a third field, and then have to make a third pass to turn the 3-tuple into three arrays, etc. So I'm fooling around with trying to do it in a single pass.

Allocating the data structures is trivial, but maintaining an index for use by the Consumer seems hideous:

int[] foo_array = new int[results.size()];
BarClass[] bar_array = new BarClass[results.size()];

// the trick is providing a stateful iterator across the array:
// - can't just use 'int', it's not effectively final
// - an actual 'final int' would be hilariously wrong
// - "all problems can be solved with a level of indirection"
class Indirection { int iterating = 0; }
final Indirection sigh = new Indirection();
// equivalent possibility is
//    final int[] disgusting = new int[]{ 0 };
// and then access disgusting[0] inside the lambda
// wash your hands after typing that code

results.stream().forEach (fab -> {
    foo_array[sigh.iterating] = fab.foo_int;
    bar_array[sigh.iterating] = fab.bar_object;
    sigh.iterating++;
});

This produces the same arrays as the existing solution using multiple stream passes, and it does so in about half the time, go figure. But the index indirection tricks seem unspeakably ugly, and of course they preclude any possibility of populating the arrays in parallel.

Using a pair of ArrayList instances, created with appropriate capacity, would let the Consumer code simply call add for each instance, and no external iterator needed. But ArrayList's toArray(T[]) has to perform a copy of the storage array again, and in the int case there's boxing/unboxing on top of that.

(edit: The answers to the "possible duplicate" question all talk about only maintaining the indices in a stream, and using direct array indexing to get to the actual data during filter/map calls, along with a note that it doesn't really work if the data isn't accessible by direct index. While this question has a List and is "directly indexable" only from a viewpoint of "well, List#get exists, technically". If the results collection above is a LinkedList, for example, then calling an O(n) get N times with nonconsecutive index would be... bad.)

Are there other, better, possibilities that I'm missing? I thought a custom Collector might do it, but I can't figure out how to maintain the state there either and never even got as far as scratch code.
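For reference, the custom Collector idea mentioned above could be sketched roughly as follows. This is only an illustration under assumptions, not something taken from the thread: `ArrayPair`, `toArrayPair()` and the constructors on the pair type are hypothetical names, and `BarClass` is a stand-in for the real class. The state lives in a mutable container that the collector creates, fills, merges and finally trims.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class PairArraysCollectorSketch {
    // Stand-in for the question's BarClass (hypothetical, for illustration).
    static class BarClass {
        final String name;
        BarClass(String name) { this.name = name; }
    }

    // Stand-in for the question's pair type; the constructor is an assumption.
    static class AFooAndABarWalkIntoABar {
        public int foo_int;
        public BarClass bar_object;
        AFooAndABarWalkIntoABar(int foo, BarClass bar) { foo_int = foo; bar_object = bar; }
    }

    // Hypothetical mutable container: both arrays plus the number of filled slots.
    static class ArrayPair {
        int[] foos = new int[16];
        BarClass[] bars = new BarClass[16];
        int size = 0;

        // Accumulator: append one pair, doubling the storage when it is full.
        void add(AFooAndABarWalkIntoABar fab) {
            if (size == foos.length) {
                foos = Arrays.copyOf(foos, size * 2);
                bars = Arrays.copyOf(bars, size * 2);
            }
            foos[size] = fab.foo_int;
            bars[size] = fab.bar_object;
            size++;
        }

        // Combiner: append another partial result (used by parallel streams).
        ArrayPair merge(ArrayPair other) {
            int total = size + other.size;
            if (total > foos.length) {
                foos = Arrays.copyOf(foos, total);
                bars = Arrays.copyOf(bars, total);
            }
            System.arraycopy(other.foos, 0, foos, size, other.size);
            System.arraycopy(other.bars, 0, bars, size, other.size);
            size = total;
            return this;
        }

        // Finisher: shrink both arrays to the number of collected elements.
        ArrayPair trim() {
            foos = Arrays.copyOf(foos, size);
            bars = Arrays.copyOf(bars, size);
            return this;
        }
    }

    static Collector<AFooAndABarWalkIntoABar, ArrayPair, ArrayPair> toArrayPair() {
        return Collector.of(ArrayPair::new, ArrayPair::add, ArrayPair::merge, ArrayPair::trim);
    }

    public static void main(String[] args) {
        List<AFooAndABarWalkIntoABar> results = Arrays.asList(
            new AFooAndABarWalkIntoABar(1, new BarClass("a")),
            new AFooAndABarWalkIntoABar(2, new BarClass("b")),
            new AFooAndABarWalkIntoABar(3, new BarClass("c")));
        ArrayPair pair = results.stream().collect(toArrayPair());
        System.out.println(Arrays.toString(pair.foos)); // prints [1, 2, 3]
    }
}
```

The trade-off in this sketch is that the arrays are grown and trimmed by copying, so it does not avoid the extra copy that the ArrayList variant incurs; what it does avoid is boxing the ints and keeping an index outside the stream.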

Recommended answer

As the size of the stream is known, there is no reason to reinvent the wheel. The simplest solution is usually the best one. The second approach you have shown is nearly there: just use an AtomicInteger as the array index and you will achieve your goal, which is a single pass over the data with the option of parallel stream execution (thanks to AtomicInteger).

So:

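// requires: import java.util.concurrent.atomic.AtomicInteger;
// foo_array and bar_array are preallocated to results.size(), as in the question
AtomicInteger index = new AtomicInteger();
results.parallelStream().forEach(fab -> {
    // each element atomically claims its own slot in both arrays
    int idx = index.getAndIncrement();
    foo_array[idx] = fab.foo_int;
    bar_array[idx] = fab.bar_object;
});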

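Thread safe for parallel execution. One iteration over the whole collection. Note that a parallel forEach does not preserve the list's order, so the arrays may come out ordered differently from the list, although each foo and its bar still share the same index.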
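If the arrays need to keep the list's order and parallelism is not required, one alternative (not part of the answer above) is a plain enhanced for loop. A local counter is allowed there because no lambda captures it, and the iteration goes through the list's iterator, so it stays a single O(n) pass even for a LinkedList. A minimal sketch, in which `Extracted`, `extract` and the constructors are hypothetical names and `BarClass` is a stand-in:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class SinglePassLoopSketch {
    // Stand-in for the question's BarClass (hypothetical, for illustration).
    static class BarClass {
        final String name;
        BarClass(String name) { this.name = name; }
    }

    // Stand-in for the question's pair type; the constructor is an assumption.
    static class AFooAndABarWalkIntoABar {
        public int foo_int;
        public BarClass bar_object;
        AFooAndABarWalkIntoABar(int foo, BarClass bar) { foo_int = foo; bar_object = bar; }
    }

    // Hypothetical holder so both arrays can be returned together.
    static class Extracted {
        final int[] foos;
        final BarClass[] bars;
        Extracted(int[] foos, BarClass[] bars) { this.foos = foos; this.bars = bars; }
    }

    static Extracted extract(List<AFooAndABarWalkIntoABar> results) {
        int[] foos = new int[results.size()];
        BarClass[] bars = new BarClass[results.size()];
        int i = 0; // plain local counter: fine here, since no lambda captures it
        for (AFooAndABarWalkIntoABar fab : results) {
            foos[i] = fab.foo_int;
            bars[i] = fab.bar_object;
            i++;
        }
        return new Extracted(foos, bars);
    }

    public static void main(String[] args) {
        List<AFooAndABarWalkIntoABar> results = new LinkedList<>(Arrays.asList(
            new AFooAndABarWalkIntoABar(10, new BarClass("x")),
            new AFooAndABarWalkIntoABar(20, new BarClass("y"))));
        Extracted out = extract(results);
        System.out.println(Arrays.toString(out.foos)); // prints [10, 20]
    }
}
```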
