Convert `BufferedReader` to `Stream<String>` in a parallel way


Question

Is there a way to obtain a `Stream<String>` `stream` from a `BufferedReader` `reader` such that each string in `stream` represents one line of `reader`, with the additional condition that `stream` is provided immediately (before `reader` has read everything)? I want to process the data of `stream` in parallel with reading it from `reader`, to save time.

Edit: I want to process the data in parallel with reading. I don't want to process different lines in parallel; they should be processed in order.

Here is an example of how I want to save time. Say our `reader` will present 100 lines to us. It takes 2 ms to read one line and 1 ms to process it. If I first read all the lines and then process them, it takes 300 ms. What I want instead: as soon as a line is read, I process it while the next line is being read in parallel. The total time is then 201 ms.
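The two totals can be checked with a short calculation. The symbols $n$ (number of lines), $t_r$ (time to read one line) and $t_p$ (time to process one line) are notation introduced here for illustration; the pipelined formula assumes processing is never slower than reading, so each line is fully processed before the next one arrives.

```latex
\begin{align*}
T_{\text{sequential}} &= n\,(t_r + t_p) = 100 \cdot (2 + 1)\,\text{ms} = 300\,\text{ms} \\
T_{\text{pipelined}}  &= n\,t_r + t_p = 100 \cdot 2\,\text{ms} + 1\,\text{ms} = 201\,\text{ms}
  \qquad (t_p \le t_r)
\end{align*}
```

In the pipelined case only the processing of the last line is not hidden behind a read, which is where the extra 1 ms comes from.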

What I don't like about `BufferedReader.lines()`: as far as I understand, reading starts only when I want to process the lines. Assume I already have my `reader` but have to do some precomputations before I can process the first line, and say they cost 30 ms. In the example above, the total time using `reader.lines()` would then be 231 ms or 301 ms (can you tell me which of those times is correct?). But it would be possible to get the job done in 201 ms, since the precomputations can run in parallel with reading the first 15 lines.

Recommended answer

You can use `reader.lines().parallel()`. This way your input will be split into chunks, and further stream operations will be performed on the chunks in parallel. If the further operations take significant time, you might get a performance improvement.
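As a minimal sketch of this suggestion: the class name `ParallelLinesDemo` and the helper `upperCaseLines` are hypothetical, and a `StringReader` stands in for a real input source. The `map` step may run on several threads and on lines out of order, while `collect` on an ordered stream still returns the results in line order.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelLinesDemo {

    // Hypothetical helper: applies a per-line operation on a parallel stream.
    // reader.lines() is an ordered stream, so the collected list keeps line order
    // even though the map step may be executed concurrently.
    static List<String> upperCaseLines(BufferedReader reader) {
        return reader.lines()
                .parallel()
                .map(String::toUpperCase) // stand-in for an expensive operation
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc"));
        upperCaseLines(reader).forEach(System.out::println);
    }
}
```

Note that this parallelizes the work across lines, so it only fits when the per-line operation does not depend on lines being processed one after another.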

In your case the default heuristic will not work as you want, and I guess there is no ready-made solution that lets you use single-line batches. You can write a custom spliterator that splits after each line. Look into the `java.util.Spliterators.AbstractSpliterator` implementation. Probably the easiest solution would be to write something similar, but limit the batch size to one element in `trySplit` and read a single line in the `tryAdvance` method.
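One way the idea of single-line batches could look is sketched below. This is an illustration under assumptions, not a tested production implementation: the class name `SingleLineSpliterator` and the helper `lines` are hypothetical, and the class implements `Spliterator<String>` directly instead of extending `AbstractSpliterator`. Each call to `trySplit` reads exactly one line and hands it off as its own one-element spliterator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical spliterator that splits off one line at a time.
public class SingleLineSpliterator implements Spliterator<String> {
    private final BufferedReader reader;

    public SingleLineSpliterator(BufferedReader reader) {
        this.reader = reader;
    }

    // Reads the next line, wrapping the checked IOException.
    private String readLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean tryAdvance(Consumer<? super String> action) {
        String line = readLine();
        if (line == null) {
            return false; // end of input
        }
        action.accept(line);
        return true;
    }

    @Override
    public Spliterator<String> trySplit() {
        String line = readLine();
        if (line == null) {
            return null; // nothing left to split off
        }
        // The split-off prefix is a batch of exactly one line.
        return Spliterators.spliterator(new Object[] { line }, ORDERED | NONNULL | IMMUTABLE);
    }

    @Override
    public long estimateSize() {
        return Long.MAX_VALUE; // number of remaining lines is unknown
    }

    @Override
    public int characteristics() {
        return ORDERED | NONNULL;
    }

    // Hypothetical convenience method: a parallel stream over the reader's lines.
    static Stream<String> lines(BufferedReader reader) {
        return StreamSupport.stream(new SingleLineSpliterator(reader), true);
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("one\ntwo\nthree"));
        // forEachOrdered hands lines to the action in encounter order.
        lines(reader).map(String::toUpperCase).forEachOrdered(System.out::println);
    }
}
```

Because every line becomes its own task, the per-task overhead is likely to be noticeable for cheap operations and very long inputs, so this probably only pays off when processing a line is expensive compared with reading it.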
