Uncompress and read a gzip file in Scala

Question

In Scala, how do I uncompress the text contained in file.gz so that it can be processed? Either storing the file's contents in a variable or saving them to a local file that the program can read afterwards would be fine.

Specifically, I am using Scalding to process compressed log data, but Scalding does not define a way to read it in FileSource.scala.

Recommended answer

Here is my version:

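import java.io.{BufferedReader, File, FileInputStream, InputStreamReader}
import java.util.zip.GZIPInputStream

// Exposes the lines of a BufferedReader as an Iterator[String].
// It reads one line ahead and treats a null from readLine() as end of stream.
// reader.ready only says whether a read would block, so it is not a reliable
// end-of-stream test and can end the iteration early.
class BufferedReaderIterator(reader: BufferedReader) extends Iterator[String] {
  private var nextLine: String = reader.readLine()

  override def hasNext: Boolean = nextLine != null

  override def next(): String = {
    if (nextLine == null) throw new NoSuchElementException("end of stream")
    val line = nextLine
    nextLine = reader.readLine()
    line
  }
}

object GzFileIterator {
  // Opens the gzip file and decodes the decompressed bytes with the given encoding.
  def apply(file: File, encoding: String): BufferedReaderIterator =
    new BufferedReaderIterator(
      new BufferedReader(
        new InputStreamReader(
          new GZIPInputStream(new FileInputStream(file)), encoding)))
}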

Then use it like this:

val iterator = GzFileIterator(new java.io.File("test.txt.gz"), "UTF-8")
iterator.foreach(println)
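
The question also asks for the contents in a variable or in a local file. Here is a minimal sketch of both, assuming the file is small enough to hold in memory for the first case. The object name GzipSketch and the helpers readAll and gunzipTo are hypothetical names chosen for illustration; they are not part of Scalding or of the answer above.

```scala
import java.io.{File, FileInputStream}
import java.nio.file.{Files, StandardCopyOption}
import java.util.zip.GZIPInputStream
import scala.io.Source

// Hypothetical helpers for illustration only.
object GzipSketch {
  // Decompress the whole file and return its text as one String.
  def readAll(file: File, encoding: String): String = {
    val in = new GZIPInputStream(new FileInputStream(file))
    try Source.fromInputStream(in, encoding).mkString
    finally in.close()
  }

  // Decompress src and write the raw bytes to dest, replacing dest if it exists.
  def gunzipTo(src: File, dest: File): Unit = {
    val in = new GZIPInputStream(new FileInputStream(src))
    try {
      Files.copy(in, dest.toPath, StandardCopyOption.REPLACE_EXISTING)
      ()
    } finally in.close()
  }
}
```

For example, val text = GzipSketch.readAll(new File("test.txt.gz"), "UTF-8") keeps the contents in a variable, while GzipSketch.gunzipTo(new File("test.txt.gz"), new File("test.txt")) leaves an uncompressed copy on disk. Both helpers close the stream when they finish, which the iterator version does not do on its own.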
