Scala: how to traverse stream/iterator collecting results into several different collections

Views: 536 | Published: 2018/11/15 22:24:25 | Tags: scala, iterator, traversal, scala-collections, collect


Question

I'm going through a log file that is too big to fit into memory and collecting two types of expressions. What would be a better functional alternative to my iterative snippet below?

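import java.io.File
import scala.collection.mutable
import scala.io.Source
import scala.util.matching.Regex

def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val lines: Iterator[String] = Source.fromFile(file).getLines()

  // Latest id seen for each IP, and every (ip, date) error in file order.
  val logins: mutable.Map[String, String] = mutable.HashMap.empty
  val errors: mutable.ListBuffer[(String, String)] = mutable.ListBuffer.empty

  for (line <- lines) {
    line match {
      case errorPat(date, ip)     => errors.append((ip, date))
      case loginPat(_, _, ip, id) => logins.put(ip, id)
      case _                      => () // ignore lines that match neither pattern
    }
  }

  // Prefix each error's IP with the id from its login, or "none" if no login was seen.
  errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip, date) }
}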

Recommended answer

Here is a possible solution:

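import java.io.File
import scala.io.Source
import scala.util.matching.Regex

def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
  val lines = Source.fromFile(file).getLines()
  // Tag each matching line as (Some(error), None) or (None, Some(login)), then split the pairs.
  val (err, log) = lines.collect {
    case errorPat(inf, ip)      => (Some((ip, inf)), None)
    case loginPat(_, _, ip, id) => (None, Some((ip, id)))
  }.toList.unzip
  val ip2id = log.flatten.toMap // later logins overwrite earlier ones for the same IP
  // Separate the id and the IP with a space, as in the question's code.
  err.collect { case Some((ip, inf)) => (ip2id.getOrElse(ip, "none") + " " + ip, inf) }
}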
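
The answer above materializes one pair of Options per matching line before unzipping. As a hedged alternative, not part of the original answer, the same result can be built in a single pass with foldLeft over an immutable accumulator. This is only a sketch: the names LogCollect, Acc and collectErrors are hypothetical, and the function takes an Iterator[String] rather than a File so it can be tried without a log file.

```scala
import scala.util.matching.Regex

object LogCollect {
  // Hypothetical accumulator: errors are kept in reverse order (cheap prepend),
  // logins maps each IP to the most recent id seen for it.
  final case class Acc(errors: List[(String, String)], logins: Map[String, String])

  def collectErrors(lines: Iterator[String], errorPat: Regex, loginPat: Regex): List[(String, String)] = {
    val result = lines.foldLeft(Acc(Nil, Map.empty)) { (acc, line) =>
      line match {
        case errorPat(date, ip)     => acc.copy(errors = (ip, date) :: acc.errors)
        case loginPat(_, _, ip, id) => acc.copy(logins = acc.logins.updated(ip, id))
        case _                      => acc // line matches neither pattern
      }
    }
    // Restore file order, then prefix each IP with its id (or "none").
    result.errors.reverse.map { case (ip, date) =>
      (result.logins.getOrElse(ip, "none") + " " + ip, date)
    }
  }
}
```

For a real file, Source.fromFile(file).getLines() could be passed as the lines argument, with the Source closed once the fold has finished. Only the matched errors and the login map are held in memory, as in the question's original loop.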
