Read and process files in Scala


Question

How do I read the contents of all the files inside an archive with a .7z extension? Say I have abc.7z containing part1.csv and part2.csv, and xyz.7z containing part3.csv and part4.csv.

I want to read the contents of part1.csv and part2.csv, which are in abc.7z, and of part3.csv and part4.csv, which are in xyz.7z.

I have tried, but somehow I cannot get it to work correctly in Scala. I'd appreciate any help!

Recommended answer

Here is one way to do it. It omits a lot of error handling and edge cases, but it shows how this can be done.

Basically, you will need to add the following dependencies to your sbt build:

  libraryDependencies ++= Seq(
    "org.apache.commons" % "commons-compress" % "1.16.1",
    "org.tukaani" % "xz" % "1.8"
  )

I just used very simple files:

part1.csv

name, value
part1, 1

part2.csv

name, value
part2, 2

part3.csv

name, value
part3, 3

part4.csv

name, value
part4, 4

I then distributed them into the abc.7z and xyz.7z archives as you described.

Here is some very simple code:

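import java.io.File
import java.nio.charset.StandardCharsets
import org.apache.commons.compress.archivers.sevenz.SevenZFile
import scala.collection.JavaConverters._

object CompressionTest extends App {

  // Returns the lines of `fileName` stored in the 7z archive `compressedFile`,
  // or an empty Vector if the archive has no such entry.
  def loadCsvLinesFromZFile(compressedFile: String, fileName: String): Vector[String] = {
    val zFile = new SevenZFile(new File(compressedFile))
    try {
      zFile.getEntries.asScala.find { entry =>
        // zFile.read works on the entry last returned by getNextEntry, so the
        // internal cursor is advanced in step with this iteration.
        // It's a bit ugly in Scala terms.
        zFile.getNextEntry
        !entry.isDirectory && entry.getName == fileName
      }.fold(Vector.empty[String]) { csv =>
        val content = new Array[Byte](csv.getSize.toInt)
        // read may return fewer bytes than requested, so loop until the
        // buffer is full or the entry ends.
        var offset = 0
        var n = 0
        while (offset < content.length && n >= 0) {
          n = zFile.read(content, offset, content.length - offset)
          if (n > 0) offset += n
        }
        // Assumes the CSV files are UTF-8 encoded.
        new String(content, 0, offset, StandardCharsets.UTF_8).split("\n").toVector
      }
    } finally zFile.close()
  }

  val allOutput = (loadCsvLinesFromZFile("abc.7z", "part1.csv") ++
    loadCsvLinesFromZFile("abc.7z", "part2.csv") ++
    loadCsvLinesFromZFile("xyz.7z", "part3.csv") ++
    loadCsvLinesFromZFile("xyz.7z", "part4.csv")).mkString("\n")

  println(allOutput)
}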
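
The question asks for every file in each archive, while the code above looks up entries by name. A minimal sketch of reading all entries in one pass could look like the following. The names `ReadAllEntries` and `loadAllLines` are hypothetical and not part of the original answer, and the sketch assumes UTF-8 text and entries small enough to hold in memory.

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import org.apache.commons.compress.archivers.sevenz.SevenZFile

object ReadAllEntries {

  // Hypothetical helper: maps each non-directory entry name in the archive
  // to the lines of that entry.
  def loadAllLines(compressedFile: String): Map[String, Vector[String]] = {
    val zFile = new SevenZFile(new File(compressedFile))
    try {
      Iterator
        .continually(zFile.getNextEntry) // null marks the end of the archive
        .takeWhile(_ != null)
        .filterNot(_.isDirectory)
        .map { entry =>
          val content = new Array[Byte](entry.getSize.toInt)
          // read may return fewer bytes than requested, so loop until done
          var offset = 0
          var n = 0
          while (offset < content.length && n >= 0) {
            n = zFile.read(content, offset, content.length - offset)
            if (n > 0) offset += n
          }
          // Assumption: entries are UTF-8 text
          val text = new String(content, 0, offset, StandardCharsets.UTF_8)
          entry.getName -> text.split("\n").toVector
        }
        .toMap // forces the lazy iterator while the archive is still open
    } finally zFile.close()
  }
}
```

With the archives from the question, `Seq("abc.7z", "xyz.7z").flatMap(ReadAllEntries.loadAllLines)` would yield one `(name, lines)` pair per CSV file. The iterator is lazy, so each entry is read before `getNextEntry` moves the cursor to the next one.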

This gave me the following output:

name, value
part1, 1
name, value
part2, 2
name, value
part3, 3
name, value
part4, 4
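
The combined output repeats the `name, value` header once per file. If a single header is wanted, one possible sketch is shown below. `MergeCsv`, `mergeWithSingleHeader`, and `parseRows` are hypothetical names introduced for illustration, and the sketch assumes every file starts with the same header row and that the second column is an integer.

```scala
import scala.util.Try

object MergeCsv {

  // Hypothetical helper: concatenates several blocks of CSV lines, keeping
  // the header of the first non-empty block and dropping it from the others.
  def mergeWithSingleHeader(blocks: Seq[Vector[String]]): Vector[String] = {
    val nonEmpty = blocks.filter(_.nonEmpty)
    nonEmpty.headOption match {
      case None        => Vector.empty
      case Some(first) => first.head +: nonEmpty.flatMap(_.tail).toVector
    }
  }

  // Skips the header, then turns "name, value" rows into (name, value) pairs.
  // Rows that do not have exactly two fields or a numeric value are skipped.
  def parseRows(lines: Vector[String]): Vector[(String, Int)] =
    lines.drop(1).flatMap { line =>
      line.split(",").map(_.trim) match {
        case Array(name, value) => Try(value.toInt).toOption.map(v => name -> v)
        case _                  => None
      }
    }
}
```

For example, passing the four line blocks returned by `loadCsvLinesFromZFile` to `MergeCsv.mergeWithSingleHeader` would keep the first `name, value` line and the four data rows, and `MergeCsv.parseRows` would then turn those rows into pairs such as `("part1", 1)`.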

I hope this helps, at least to get you started.
