Concurrent reading and processing file line by line in Scala
Suppose I need to apply two functions, f: String => A and g: A => B, to each line of a large text file, eventually producing a list of B.
Since the file is large and f and g are expensive, I would like to make the processing concurrent. I can use parallel collections and do something like io.Source.fromFile("data.txt").getLines.toList.par.map(l => g(f(l))), but that does not run the file reading, f, and g concurrently: toList reads the whole file before any processing starts.
What is the best way to implement concurrency in this example?
You can use map on Future:
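import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
// Start one Future per line as it is read; stringToA and aToB play the roles of f and g
val futures = io.Source.fromFile(fileName).getLines.map{ s => Future{ stringToA(s) }.map{ aToB } }.toIndexedSeq
val results = futures.map{ Await.result(_, 10.seconds) }
// alternatively, replace the line above with:
val results = Await.result(Future.sequence(futures), 10.seconds)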
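For reference, here is a minimal self-contained sketch of the same idea. The object name ConcurrentLines, the process helper, and the toy bodies of f and g are all hypothetical stand-ins invented for illustration. An in-memory Iterator replaces the file so the snippet runs on its own; in real use, io.Source.fromFile(fileName).getLines would supply the iterator.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ConcurrentLines {
  // Hypothetical stand-ins for the expensive functions f and g.
  def f(line: String): Int = line.trim.length
  def g(a: Int): String = "len=" + a

  // Starts one Future per line as the iterator is consumed,
  // then waits for all of them and returns the results in input order.
  def process(lines: Iterator[String]): List[String] = {
    val futures = lines.map(s => Future(f(s)).map(g)).toList
    Await.result(Future.sequence(futures), 10.seconds)
  }

  def main(args: Array[String]): Unit = {
    // An in-memory iterator stands in for the lines of a file.
    val results = process(Iterator("alpha", "be", "gamma "))
    println(results)
  }
}
```

Each Future is submitted while the iterator is still being consumed, so reading can overlap with f and g running on the global execution context. Note that toList keeps one Future per line in memory, so for a very large file some form of batching may be needed.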