Uncompress and read a gzip file in Scala
Question
In Scala, how does one uncompress the text contained in file.gz so that it can be processed? I would be happy with either having the contents of the file stored in a variable, or saving it as a local file so that the program can read it afterwards.
Specifically, I am using Scalding to process compressed log data, but Scalding does not define a way to read them in FileSource.scala.
Recommended answer
Here is my version:
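import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.util.zip.GZIPInputStream

// Iterates over the lines of a reader. reader.ready only reports whether a
// read would block, not whether the stream has ended, so one line is
// buffered ahead and end of stream is detected by readLine() returning null.
class BufferedReaderIterator(reader: BufferedReader) extends Iterator[String] {
  private var nextLine: String = reader.readLine()
  override def hasNext: Boolean = nextLine != null
  override def next(): String = {
    if (nextLine == null) throw new NoSuchElementException("end of stream")
    val line = nextLine
    nextLine = reader.readLine()
    line
  }
}

object GzFileIterator {
  // Opens a gzip-compressed text file and iterates over its decoded lines.
  def apply(file: java.io.File, encoding: String): BufferedReaderIterator =
    new BufferedReaderIterator(
      new BufferedReader(
        new InputStreamReader(
          new GZIPInputStream(new FileInputStream(file)), encoding)))
}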
Then use it like this:
val iterator = GzFileIterator(new java.io.File("test.txt.gz"), "UTF-8")
iterator.foreach(println)
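The question also mentions two other outcomes: holding the whole content in a variable, or saving the decompressed text as a local file. The following is a minimal sketch of both, not part of the original answer. It uses only the JDK and the Scala standard library; the names GzipSketch, readAll and decompressTo are hypothetical, and readAll assumes the decompressed file is small enough to fit in memory.

```scala
import java.io.{File, FileInputStream}
import java.nio.file.{Files, StandardCopyOption}
import java.util.zip.GZIPInputStream
import scala.io.Source

// Hypothetical helper object, for illustration only.
object GzipSketch {
  // Reads the whole decompressed text into a String.
  // Assumption: the content fits comfortably in memory.
  def readAll(file: File, encoding: String = "UTF-8"): String = {
    val in = new GZIPInputStream(new FileInputStream(file))
    try Source.fromInputStream(in, encoding).mkString
    finally in.close()
  }

  // Streams the decompressed bytes to `target`, replacing it if it exists.
  def decompressTo(source: File, target: File): Unit = {
    val in = new GZIPInputStream(new FileInputStream(source))
    try {
      Files.copy(in, target.toPath, StandardCopyOption.REPLACE_EXISTING)
      ()
    } finally in.close()
  }
}
```

For example, val text = GzipSketch.readAll(new File("test.txt.gz")) keeps the content in a variable, while GzipSketch.decompressTo(new File("test.txt.gz"), new File("test.txt")) writes a plain-text copy that can be read later. For large logs, the line iterator above is likely the better fit because it never holds the whole file at once.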