How can I avoid mutable variables in Scala when using ZipInputStreams and ZipOutputStreams?


Question

I'm trying to read a zip file, check that it has some required files, and then write all valid files out to another zip file. The basic introduction to java.util.zip has a lot of Java-isms and I'd love to make my code more Scala-native. Specifically, I'd like to avoid the use of vars. Here's what I have:

val fos = new FileOutputStream("new.zip")
val zipOut = new ZipOutputStream(new BufferedOutputStream(fos))

while (zipIn.available == 1) {
  val entry = zipIn.getNextEntry
  if (entryIsValid(entry)) {
    zipOut.putNextEntry(new ZipEntry("subdir/" + entry.getName()))
    // read data into the data Array
    val data = new Array[Byte](1024)
    var count = zipIn.read(data, 0, 1024)
    while (count != -1) {
      zipOut.write(data, 0, count)
      count = zipIn.read(data, 0, 1024)
    }
  }
}
// close both streams only after every entry has been processed
zipIn.close
zipOut.close

I should add that I'm using Scala 2.7.7.

Recommended answer

I don't think there's anything particularly wrong with using Java classes that were designed to work imperatively in the way they were designed to be used. Idiomatic Scala includes being able to use idiomatic Java as it was intended, even if the styles do clash a bit.

However, if you want to do this in a more functional, var-free way (perhaps as an exercise, or perhaps because it does slightly clarify the logic), you can. It's particularly nice in 2.8, so even though you're using 2.7.7, I'll give a 2.8 answer.

First, we need to set up the problem, which you didn't entirely do, so let's suppose we have something like this:

import java.io._
import java.util.zip._
import scala.collection.immutable.Stream

val fos = new FileOutputStream("new.zip")
val zipOut = new ZipOutputStream(new BufferedOutputStream(fos))
val zipIn = new ZipInputStream(new FileInputStream("old.zip"))
def entryIsValid(ze: ZipEntry) = !ze.isDirectory

Now, given this, we want to copy the zip file. The trick is the continually method on collection.immutable.Stream, which performs a lazily evaluated loop for you. You can then apply takeWhile and filter to the results to terminate the loop and process only what you want. It's a handy pattern when you have something that you wish were an iterator but isn't. (If the item updates itself, you can use .iterate in Iterable or Iterator, which is usually even better.) Here it is applied to this case, used twice: once to get the entries, and once to read and write chunks of data:

val buffer = new Array[Byte](1024)
// outer loop: one element per zip entry, ending when getNextEntry returns null
Stream.continually(zipIn.getNextEntry).
  takeWhile(_ != null).filter(entryIsValid).
  foreach(entry => {
    zipOut.putNextEntry(new ZipEntry("subdir/"+entry.getName))
    // inner loop: one element per chunk read, ending when read returns -1
    Stream.continually(zipIn.read(buffer)).takeWhile(_ != -1).
      foreach(count => zipOut.write(buffer,0,count))
  })
zipIn.close
zipOut.close

Pay close attention to the . at the end of some lines! I would normally write this on one long line, but it's nicer to have it wrap so you can see it all here.

Just in case it isn't clear, let's unpack one of the uses of continually.

Stream.continually(zipIn.read(buffer))

This asks to keep calling zipIn.read(buffer) as many times as necessary, storing the integer that results from each call.

.takeWhile(_ != -1)

This specifies how many times are necessary: it returns a stream of indefinite length that ends as soon as it hits a -1.

.foreach(count => zipOut.write(buffer,0,count))

This processes the stream, taking each item in turn (the count) and using it to write out the buffer. It works in a slightly sneaky way, because you rely on zipIn.read having just been called to produce the current element of the stream. If you tried to traverse the stream a second time, rather than in a single pass, it would fail because buffer would have been overwritten. Here, though, it's okay.
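To see the read loop in isolation, here is a minimal sketch that copies one in-memory stream to another. It is an illustration rather than part of the original answer: the names ContinuallyDemo, copy, and bufferSize are hypothetical, and Iterator.continually is swapped in for Stream.continually, assuming Scala 2.8 or later (an Iterator does not retain earlier elements, and Stream is deprecated as of Scala 2.13).

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}

object ContinuallyDemo {
  // Hypothetical helper: copies everything from `in` to `out` through one reused buffer.
  def copy(in: InputStream, out: OutputStream, bufferSize: Int = 4): Unit = {
    val buffer = new Array[Byte](bufferSize)
    Iterator.continually(in.read(buffer))            // call read again each time an element is requested
      .takeWhile(_ != -1)                            // stop once read reports end of stream
      .foreach(count => out.write(buffer, 0, count)) // write the chunk before the next read overwrites it
  }

  def main(args: Array[String]): Unit = {
    val source = "hello, zip world".getBytes("UTF-8")
    val out = new ByteArrayOutputStream()
    copy(new ByteArrayInputStream(source), out)
    println(new String(out.toByteArray, "UTF-8")) // prints "hello, zip world"
  }
}
```

The small default buffer is deliberate: it forces several reads, so each write has to happen before the next read reuses the buffer. That is the single-pass dependency described above.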

So, there it is: a slightly more compact method that is possibly easier to understand, possibly harder, and more functional (though there are still side effects galore). In 2.7.7, by contrast, I would actually do it the Java way, because Stream.continually isn't available and the overhead of building a custom Iterator isn't worth it for this one case. (It would be worth it if I were going to do more zip file processing and could reuse the code, however.)

Watching for available to drop to zero is a flaky way to detect the end of the zip file. I think the "correct" way is to wait until getNextEntry returns null. With that in mind, I've edited the previous code (what was a takeWhile(_ => zipIn.available==1) is now a takeWhile(_ != null)) and provided a 2.7.7 iterator-based version below. Note how small the main loop is once you get through the work of defining the iterators, which admittedly do use vars:

val buffer = new Array[Byte](1024)
class ZipIter(zis: ZipInputStream) extends Iterator[ZipEntry] {
  private var entry:ZipEntry = zis.getNextEntry
  private var cached = true
  private def cache { if (entry != null && !cached) {
    cached = true; entry = zis.getNextEntry
  }}
  def hasNext = { cache; entry != null }
  def next = {
    if (!cached) cache
    cached = false
    entry
  }
}
class DataIter(is: InputStream, ab: Array[Byte]) extends Iterator[(Int,Array[Byte])] {
  private var count = 0
  private var waiting = false
  def hasNext = { 
    if (!waiting && count != -1) { count = is.read(ab); waiting=true }
    count != -1
  }
  def next = { waiting=false; (count,ab) }
}
(new ZipIter(zipIn)).filter(entryIsValid).foreach(entry => {
  zipOut.putNextEntry(new ZipEntry("subdir/"+entry.getName))
  (new DataIter(zipIn,buffer)).foreach(cb => zipOut.write(cb._2,0,cb._1))
})
zipIn.close
zipOut.close
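If the copy logic is going to be reused, the same idea can be packaged as a small helper. The following is only a sketch under stated assumptions, not code from the original answer: it assumes Scala 2.8 or later (where Iterator.continually is available), the names ZipCopy, entries, copyValid, sampleArchive, and entryNames are hypothetical, and it works on in-memory streams so that it runs without any files on disk.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

object ZipCopy {
  // Lazily yields entries until getNextEntry returns null (end of archive).
  def entries(zis: ZipInputStream): Iterator[ZipEntry] =
    Iterator.continually(zis.getNextEntry).takeWhile(_ != null)

  // Copies every entry accepted by `keep` from `in` to `out`, prefixing each name with `prefix`.
  def copyValid(in: InputStream, out: OutputStream, prefix: String)(keep: ZipEntry => Boolean): Unit = {
    val zipIn = new ZipInputStream(in)
    val zipOut = new ZipOutputStream(out)
    val buffer = new Array[Byte](1024)
    try {
      entries(zipIn).filter(keep).foreach { entry =>
        zipOut.putNextEntry(new ZipEntry(prefix + entry.getName))
        Iterator.continually(zipIn.read(buffer)).takeWhile(_ != -1)
          .foreach(count => zipOut.write(buffer, 0, count))
        zipOut.closeEntry()
      }
    } finally {
      // Close both streams even if copying fails part-way through.
      zipIn.close()
      zipOut.close()
    }
  }

  // Builds a tiny archive in memory: one directory entry and one file.
  def sampleArchive(): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val zos = new ZipOutputStream(bytes)
    zos.putNextEntry(new ZipEntry("docs/"))
    zos.closeEntry()
    zos.putNextEntry(new ZipEntry("docs/a.txt"))
    zos.write("hi".getBytes("UTF-8"))
    zos.closeEntry()
    zos.close()
    bytes.toByteArray
  }

  // Lists the entry names found in an archive held in memory.
  def entryNames(archive: Array[Byte]): List[String] =
    entries(new ZipInputStream(new ByteArrayInputStream(archive))).map(_.getName).toList

  def main(args: Array[String]): Unit = {
    val dst = new ByteArrayOutputStream()
    copyValid(new ByteArrayInputStream(sampleArchive()), dst, "subdir/")(!_.isDirectory)
    println(entryNames(dst.toByteArray)) // prints List(subdir/docs/a.txt)
  }
}
```

Both loops still depend on side effects, just as in the versions above. The difference is that no var appears in user code, and the try/finally closes both streams even when an exception interrupts the copy.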
