Java parallel stream: how to wait for threads for a parallel stream to finish?


Question

I have a list from which I obtain a parallel stream to fill a map, as follows:

Map<Integer, TreeNode> map = new HashMap<>();
List<NodeData> list = some_filled_list;

// Put data from the list into the map
list.parallelStream().forEach(d -> {
    TreeNode node = new TreeNode(d);
    map.put(node.getId(), node);
});

// Print out the map
map.entrySet().stream().forEach(entry -> {
    System.out.println("Processing node with ID = " + entry.getValue().getId());
});

The problem with this code is that the map seems to be printed while the "putting data" step is still running (because it is parallel), so the map has not yet received all the elements from the list. Of course, in my real code I am not just printing the map; I use a map to take advantage of O(1) lookup time.

My questions are:

  1. How do I make the main thread wait so that the "putting data" step finishes before the map is printed? I tried running the "putting data" step in a thread t and calling t.start() and t.join(), but that did not help.

  2. Maybe I am not supposed to use a parallel stream in this case? The list is long, and I just want to take advantage of parallelism to improve efficiency.

Recommended answer

With list.parallelStream().forEach you are violating the side-effects property that is explicitly stated in the Stream documentation.

Also, when you say that the map is being printed while the "putting data" process is still going on (because it is parallel), that is not true: forEach is a terminal operation, and it finishes before execution moves on to the next line. You are probably seeing missing entries because you are writing to a non-thread-safe HashMap. Think about it the other way: what would happen if you put multiple entries from multiple threads into a HashMap? Lots of things can break, such as missing entries or an incorrect/inconsistent map.
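To make the blocking behavior of forEach concrete, here is a minimal sketch. The class name ForEachBlocksDemo and the helper countProcessed are made up for illustration and are not part of the question's code. It counts processed elements with a thread-safe AtomicInteger and reads the counter right after forEach returns.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachBlocksDemo {

    // Returns how many elements had been processed at the moment forEach returned.
    static int countProcessed(int size) {
        List<Integer> list = IntStream.range(0, size).boxed().collect(Collectors.toList());
        AtomicInteger processed = new AtomicInteger();

        // forEach is a terminal operation: the calling thread does not
        // continue past this statement until every element has been handled.
        list.parallelStream().forEach(i -> processed.incrementAndGet());

        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(countProcessed(100_000)); // prints 100000
    }
}
```

Because the counter is thread-safe, the value read after forEach always equals the list size, which suggests that the missing entries in the question come from the unsynchronized HashMap rather than from forEach returning early.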

Of course, changing that to a ConcurrentHashMap would work, since it is thread-safe, but you would still be violating the side-effects property, although in a "safe" way.
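As a rough illustration of the ConcurrentHashMap variant, the sketch below fills a map from a parallel forEach. ConcurrentMapDemo, fill, and the "node-" string values are hypothetical stand-ins; the question's TreeNode type is replaced by String to keep the example self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ConcurrentMapDemo {

    // Fills a thread-safe map from a parallel stream via a side effect.
    static Map<Integer, String> fill(int size) {
        List<Integer> ids = IntStream.range(0, size).boxed().collect(Collectors.toList());
        Map<Integer, String> map = new ConcurrentHashMap<>();

        // Safe only because ConcurrentHashMap supports concurrent put calls;
        // this is still a side effect on shared state.
        ids.parallelStream().forEach(id -> map.put(id, "node-" + id));

        return map;
    }

    public static void main(String[] args) {
        System.out.println(fill(100_000).size()); // prints 100000
    }
}
```

This version is safe, but it still relies on the lambda mutating shared state.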

The correct thing to do is to collect to a Map directly, without forEach:

Map<Integer, TreeNode> map = list.parallelStream()
        .collect(Collectors.toMap(
                NodeData::getId,
                TreeNode::new
        ));

This way everything works correctly, even with parallel processing. Just note that you would need a lot of elements (tens of thousands) to see any measurable performance gain from parallel processing.
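For reference, here is a self-contained sketch of the collect-based approach. The NodeData and TreeNode classes below are hypothetical stand-ins, since the question does not show them. In particular, the sketch assumes NodeData exposes getId(), whereas the question reads the id from TreeNode.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CollectToMapDemo {

    // Hypothetical stand-in for the question's NodeData; only an id is assumed.
    static final class NodeData {
        private final int id;
        NodeData(int id) { this.id = id; }
        int getId() { return id; }
    }

    // Hypothetical stand-in for the question's TreeNode, built from a NodeData.
    static final class TreeNode {
        private final int id;
        TreeNode(NodeData data) { this.id = data.getId(); }
        int getId() { return id; }
    }

    // Builds the map with a collector instead of mutating a shared map.
    static Map<Integer, TreeNode> buildMap(List<NodeData> list) {
        return list.parallelStream()
                .collect(Collectors.toMap(NodeData::getId, TreeNode::new));
    }

    public static void main(String[] args) {
        List<NodeData> list = IntStream.range(0, 50_000)
                .mapToObj(NodeData::new)
                .collect(Collectors.toList());

        Map<Integer, TreeNode> map = buildMap(list);
        System.out.println(map.size());          // prints 50000
        System.out.println(map.get(42).getId()); // prints 42
    }
}
```

Note that the two-argument Collectors.toMap throws an IllegalStateException if two elements produce the same key, so this sketch assumes the ids are unique. The three-argument overload accepts a merge function for the case where duplicates are possible.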
