What is the preferred way to implement 'yield' in Scala?

Question

I am writing code for PhD research and starting to use Scala. I often have to do text processing. I am used to Python, whose 'yield' statement is extremely useful for implementing complex iterators over large, often irregularly structured text files. Similar constructs exist in other languages (e.g. C#), for good reason.

Yes, I know there have been previous threads on this. But they look like hacked-up (or at least badly explained) solutions that don't clearly work well and often have unclear limitations. I would like to write code something like this:

import scala.io.Source
import generator._  // hypothetical library providing generate() and give()

def yield_values(file:String) = {
  generate {
    for (x <- Source.fromFile(file).getLines()) {
      // Scala is already using the 'yield' keyword.
      give("something")
      for (field <- ":".r.split(x)) {
        if (field contains "/") {
          for (subfield <- "/".r.split(field)) { give(subfield) }
        } else {
          // Scala has no 'continue'.  IMO that should be considered
          // a bug in Scala.
          // Preferred: if (field.startsWith("#")) continue
          // Actual: Need to indent all following code
          if (!field.startsWith("#")) {
            val some_calculation = { ... do some more stuff here ... }
            if (some_calculation && field.startsWith("r")) {
              give("r")
              give(field.drop(1))
            } else {
              // Typically there will be a good deal more code here to handle different cases
              give(field)
            }
          }
        }
      }
    }
  }
}

I'd like to see the code that implements generate() and give(). BTW give() should be named yield() but Scala has taken that keyword already.

I gather that, for reasons I don't understand, Scala continuations may not work inside a for statement. If so, generate() should supply an equivalent function that works as close as possible to a for statement, because iterator code with yield almost inevitably sits inside a for loop.

Please, I would prefer not to get any of the following answers:

  1. 'yield' sucks, continuations are better. (Yes, in general you can do more with continuations. But they are hella hard to understand, and 99% of the time an iterator is all you want or need. If Scala provides lots of powerful tools but they're too hard to use in practice, the language won't succeed.)
  2. This is a duplicate. (Please see my comments above.)
  3. You should rewrite your code using streams, continuations, recursion, etc. etc. (Please see #1. I will also add, technically you don't need for loops either. For that matter, technically you can do absolutely everything you ever need using SKI combinators.)
  4. Your function is too long. Break it up into smaller pieces and you won't need 'yield'. You'd have to do this in production code, anyway. (First, "you won't need 'yield'" is doubtful in any case. Second, this isn't production code. Third, for text processing like this, very often, breaking the function into smaller pieces -- especially when the language forces you to do this because it lacks the useful constructs -- only makes the code harder to understand.)
  5. Rewrite your code with a function passed in. (Technically, yes you can do this. But the result is no longer an iterator, and chaining iterators is much nicer than chaining functions. In general, a language should not force me to write in an unnatural style -- certainly, the Scala creators believe this in general, since they provide shitloads of syntactic sugar.)
  6. Rewrite your code in this, that, or the other way, or some other cool, awesome way I just thought of.

Recommended answer

The premise of your question seems to be that you want exactly Python's yield, and you don't want any other reasonable suggestions to do the same thing in a different way in Scala. If this is true, and it is that important to you, why not use Python? It's quite a nice language. Unless your Ph.D. is in computer science and using Scala is an important part of your dissertation, if you're already familiar with Python and really like some of its features and design choices, why not use it instead?

Anyway, if you actually want to learn how to solve your problem in Scala, it turns out that for the code you have, delimited continuations are overkill. All you need are flatMapped iterators.

Here's how you do it.

// You want to write
for (x <- xs) { /* complex yield in here */ }
// Instead you write
xs.iterator.flatMap { /* Produce iterators in here */ }

// You want to write
yield(a)
yield(b)
// Instead you write
Iterator(a,b)

// You want to write
yield(a)
/* complex set of yields in here */
// Instead you write
Iterator(a) ++ /* produce complex iterator here */

That's it! All your cases can be reduced to one of these three.
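
As a minimal runnable sketch of the three rewrites above (the object and method names here are made up for illustration, not part of any library):

```scala
object YieldPatterns {
  // Pattern 1: a loop whose body yields values becomes flatMap over an iterator.
  def doubled(xs: Seq[Int]): Iterator[Int] =
    xs.iterator.flatMap(x => Iterator(x, x * 2))

  // Pattern 2: consecutive yields become a single Iterator literal.
  def pair(a: String, b: String): Iterator[String] =
    Iterator(a, b)

  // Pattern 3: one yield followed by further yields becomes concatenation with ++.
  def headerThen(header: String, rest: Seq[String]): Iterator[String] =
    Iterator(header) ++ rest.iterator.filter(_.nonEmpty)
}
```

Since `flatMap` and `++` on `Iterator` are lazy, nothing is computed until the result is consumed, which is the property a Python generator gives you.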

In your case, your example would look something like

Source.fromFile(file).getLines().flatMap(x =>
  Iterator("something") ++
  ":".r.split(x).iterator.flatMap(field =>
    if (field contains "/") "/".r.split(field).iterator
    else {
      if (!field.startsWith("#")) {
        /* vals, whatever */
        if (some_calculation && field.startsWith("r")) Iterator("r", field.drop(1))
        else Iterator(field)
      }
      else Iterator.empty
    }
  )
)
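
For something you can paste and run, here is a self-contained sketch of the same idea. It takes any `Iterator[String]` of lines rather than a file, and `someCalculation` is a hypothetical stand-in for the elided calculation in the question, so treat it as an assumption rather than the real logic.

```scala
object FieldTokens {
  // Hypothetical placeholder for the question's elided "some_calculation".
  def someCalculation(field: String): Boolean = field.length > 1

  // Lazily turns lines into a flat stream of tokens, mirroring the question's loop.
  def tokens(lines: Iterator[String]): Iterator[String] =
    lines.flatMap { line =>
      Iterator("something") ++
        line.split(":").iterator.flatMap { field =>
          if (field.contains("/")) field.split("/").iterator
          else if (field.startsWith("#")) Iterator.empty // the 'continue' case
          else if (someCalculation(field) && field.startsWith("r")) Iterator("r", field.drop(1))
          else Iterator(field)
        }
    }
}
```

To read from a file you would pass `scala.io.Source.fromFile(file).getLines()` as the argument. Flattening the nested `if` into an `else if` chain also takes care of the indentation complaint about the missing `continue`.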

P.S. Scala does have continue; it's done like so (implemented by throwing stackless (light-weight) exceptions):

import scala.util.control.Breaks._
for (blah) { breakable { ... break ... } }

but that won't get you what you want because Scala doesn't have the yield you want.
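
To make the `breakable` idiom concrete, here is a small sketch where `break()` inside a per-iteration `breakable` block behaves like `continue`. The `ContinueDemo` object and the `nonComments` helper are invented for this example.

```scala
import scala.util.control.Breaks.{break, breakable}

object ContinueDemo {
  // Collects every field that does not start with '#'.
  def nonComments(fields: Seq[String]): List[String] = {
    val kept = scala.collection.mutable.ListBuffer.empty[String]
    for (field <- fields) {
      breakable {
        if (field.startsWith("#")) break() // leaves only this iteration's block
        kept += field
      }
    }
    kept.toList
  }
}
```

Putting `breakable` around the whole `for` instead would give you `break` semantics rather than `continue`.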
