How to use IO with Scalaz7 Iteratees without overflowing the stack?

Question

Consider this code (taken from here and modified to use bytes rather than lines of characters):

import java.io.{ File, InputStream, BufferedInputStream, FileInputStream }
import scalaz._, Scalaz._, effect._, iteratee.{ Iteratee => I, _ }
import std.list._

object IterateeIOExample {
  type ErrorOr[+A] = EitherT[IO, Throwable, A]

  def openStream(f: File) = IO(new BufferedInputStream(new FileInputStream(f)))
  def readByte(s: InputStream) = IO(Some(s.read()).filter(_ != -1))
  def closeStream(s: InputStream) = IO(s.close())

  def tryIO[A, B](action: IO[B]) = I.iterateeT[A, ErrorOr, B] {
    EitherT(action.catchLeft).map(r => I.sdone(r, I.emptyInput))
  }

  def enumBuffered(r: => BufferedInputStream) = new EnumeratorT[Int, ErrorOr] {
    lazy val reader = r
    def apply[A] = (s: StepT[Int, ErrorOr, A]) => s.mapCont(k =>
      tryIO(readByte(reader)) flatMap {
        case None => s.pointI
        case Some(byte) => k(I.elInput(byte)) >>== apply[A]
      })
  }

  def enumFile(f: File) = new EnumeratorT[Int, ErrorOr] {
    def apply[A] = (s: StepT[Int, ErrorOr, A]) =>
      tryIO(openStream(f)).flatMap(stream => I.iterateeT[Int, ErrorOr, A](
        EitherT(
          enumBuffered(stream).apply(s).value.run.ensuring(closeStream(stream)))))
  }

  def main(args: Array[String]): Unit = {
    val action = (
      I.consume[Int, ErrorOr, List] &=
      enumFile(new File(args(0)))).run.run
    println(action.unsafePerformIO())
  }
}

Running this code on a decent-sized file (8 KB) produces a StackOverflowError. Some searching turned up that the error can be avoided by using the Trampoline monad instead of IO, but that doesn't seem like a great solution: it sacrifices functional purity just to get the program to complete at all. The obvious fix is to use IO or Trampoline as a monad transformer wrapping the other, but I can't find a transformer version of either, and I'm not enough of a functional-programming guru to write my own (learning more about FP is one of the purposes of this project, but I suspect creating new monad transformers is a bit above my level at the moment). I suppose I could wrap one big IO action around creating, running and returning the result of my iteratees, but that feels more like a workaround than a solution.
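For readers unfamiliar with the trampolining mentioned above, here is a minimal sketch of the idea. It uses the Scala standard library's `scala.util.control.TailCalls` rather than Scalaz's own Trampoline, and the `TrampolineDemo`, `even`, `odd` and `isEven` names are purely illustrative: each step returns a description of the next call instead of making it, and a loop in `result` runs those steps in constant stack space.

```scala
import scala.util.control.TailCalls.{ TailRec, done, tailcall }

object TrampolineDemo {
  // Mutually recursive functions that would normally use one stack frame per call.
  // Returning TailRec values defers each call, so nothing nests on the JVM stack.
  def even(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(odd(n - 1))

  def odd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(even(n - 1))

  // `result` interprets the chain of deferred calls in a loop.
  // Intended for non-negative n only.
  def isEven(n: Int): Boolean = even(n).result
}
```

This only illustrates the technique; it does not show how to combine trampolining with IO in Scalaz, which is the open part of the question.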

Presumably some monads can't be converted to monad transformers, so I'd like to know whether it's possible to work with large files without dropping IO or overflowing the stack, and if so, how?

Bonus question: I can't think of any way for an iteratee to signal that it has encountered an error while processing except to have it return Either, which makes iteratees harder to compose. The code above shows how to use EitherT to handle errors in the enumerator, but how does that work for the iteratees?

Recommended answer

After creating exceptions and printing their stack-trace length at various places in your code, I concluded that your code itself is not overflowing: everything seems to run in constant stack space. So I looked elsewhere. Eventually I copied the implementation of consume, added some stack-depth printing, and confirmed that the overflow happens there.
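The stack-depth check described above can be sketched with the standard library alone. The `StackDepthProbe`, `stackDepth` and `depthAt` names are hypothetical and not taken from the answer's actual instrumentation; the idea is simply to create a `Throwable` and count its stack-trace elements.

```scala
object StackDepthProbe {
  // Number of frames on the current thread's stack at the point of the call,
  // read from a freshly created Throwable (which is never thrown).
  def stackDepth(): Int = new Throwable().getStackTrace.length

  // Recurses n levels and reports the depth seen at the bottom.
  // Wrapping the recursive call in math.max keeps it out of tail position,
  // so the compiler cannot turn the recursion into a loop.
  def depthAt(n: Int): Int =
    if (n == 0) stackDepth()
    else math.max(depthAt(n - 1), 0)
}
```

Printing `StackDepthProbe.stackDepth()` from inside an enumerator or iteratee step shows whether the depth stays flat or grows with the input. One caveat, stated as an assumption about the JVM in use: HotSpot truncates recorded stack traces by default (commonly at 1024 frames), so this is suited to spotting growth rather than measuring very deep stacks.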

So this overflows:

(I.consume[Int, Id, List] &= EnumeratorT.enumStream(Stream.fill(10000)(1))).run

But I then found that this doesn't:

(I.putStrTo[Int](System.out) &= EnumeratorT.enumStream(Stream.fill(10000)(1)))
  .run.unsafePerformIO()

putStrTo uses foldM and somehow does not cause an overflow. So I wondered whether consume could be implemented in terms of foldM. I copied a few things over from consume and tweaked them until it compiled:

def consume1[E, F[_]:Monad, A[_]:PlusEmpty:Applicative]: IterateeT[E, F, A[E]] = {
  I.foldM[E, F, A[E]](PlusEmpty[A].empty){ (acc: A[E], e: E) =>
    (Applicative[A].point(e) <+> acc).point[F]
  }
}

And it worked! It prints a long list of ints.
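The answer does not explain why the fold-based version behaves differently. One plausible reading, offered here as an assumption rather than a verified diagnosis of Scalaz's consume, is the usual difference between building a result after each recursive call returns and threading an accumulator forward. The plain-Scala sketch below illustrates that difference without any Scalaz types; `CollectDemo`, `collectRecursive` and `collectFold` are hypothetical names.

```scala
object CollectDemo {
  // Builds the list on the way back out of the recursion: the `::` runs only
  // after the recursive call returns, so each element costs one stack frame.
  def collectRecursive(xs: Iterator[Int]): List[Int] =
    if (xs.hasNext) {
      val x = xs.next()
      x :: collectRecursive(xs)
    } else Nil

  // Threads an accumulator forward instead: foldLeft is a loop, so the stack
  // stays flat no matter how many elements arrive.
  // Like consume1, it prepends each new element, so the result is reversed.
  def collectFold(xs: Iterator[Int]): List[Int] =
    xs.foldLeft(List.empty[Int])((acc, x) => x :: acc)
}
```

Note that consume1 computes `Applicative[A].point(e) <+> acc`, which for List puts each new element in front of the accumulator, so its output comes out in reverse input order. If order matters, reverse the result or append instead, accepting that appending to a List costs time proportional to its length on every step.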
