实施“生成器".支持自定义语言 [英] Implementing "Generator" support in a custom language

查看:103
本文介绍了实施“生成器".支持自定义语言的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在语言设计上有点偏爱,目前正在玩自己的业余语言. ( http://rogeralsing.com/2010/04/14/playing- with-plastic/)

I've got a bit of fettish for language design and I'm currently playing around with my own hobby language. (http://rogeralsing.com/2010/04/14/playing-with-plastic/)

真正让我流血的一件事是发电机"和收益"关键字. 我知道C#使用AST转换将枚举器方法转换为状态机.

One thing that really makes my mind bleed is "generators" and the "yield" keyword. I know C# uses AST transformation to transform enumerator methods into statemachines.

但是它如何用其他语言工作? 有什么方法可以在不进行AST转换的语言中获得生成器支持吗? 例如像Python或Ruby这样的语言是否会诉诸AST转换来解决此问题?

But how does it work in other languages? Is there any way to get generator support in a language w/o AST transformation? e.g. Does languages like Python or Ruby resort to AST transformations to solve this to?

(问题是如何在引擎盖下以不同的语言实现生成器,而不是如何用其中一种语言编写生成器)

(The question is how generators are implemented under the hood in different languages, not how to write a generator in one of them)

推荐答案

生成器基本上是半协程,但有一些令人讨厌的限制.因此,显然,您可以使用半协程(当然还有完整协程)来实现它们.

Generators are basically semi-coroutines with some annoying limitations. So, obviously, you can implement them using semi-coroutines (and full coroutines, of course).

如果没有协程,则可以使用任何其他通用控制流构造.从每个控制流结构(包括所有其他通用控制流结构)的意义上讲,有很多通用"控制流结构,包括协程,因此生成器可以(或多或少) )转化为仅通用结构.

If you don't have coroutines, you can use any of the other universal control flow constructs. There are a lot of control flow constructs that are "universal" in the sense that every control flow construct (including all the other universal control flow constructs), including coroutines and thus generators can be (more or less) trivially transformed into only that universal construct.

其中最知名的可能是GOTO.仅需GOTO,就可以构建 any 其他控制流构造:IF-THEN-ELSEWHILEFORREPEAT-UNTILFOREACH,异常,线程,子例程调用,方法调用,函数调用等,当然还有协程和生成器.

The most well-known of those is probably GOTO. With just GOTO, you can build any other control flow construct: IF-THEN-ELSE, WHILE, FOR, REPEAT-UNTIL, FOREACH, exceptions, threads, subroutine calls, method calls, function calls and so on, and of course also coroutines and generators.

几乎所有CPU都支持GOTO(尽管在CPU中通常将其称为jmp).实际上,在许多CPU中,GOTO是仅 的控制流结构,尽管今天本机至少支持子例程调用(call)以及某些原始形式的异常处理和/或并发支持.通常也内置(比较和交换).

Almost all CPUs support GOTO (although in a CPU, they usually call it jmp). In fact, in many CPUs, GOTO is the only control flow construct, although today native support for at least subroutine calls (call) and maybe some primitive form of exception handling and/or concurrency primitive (compare-and-swap) are usually also built in.

另一个众所周知的控制流原语是延续.基本上,延续是GOTO的结构化更好,更易于管理且不那么邪恶的变体,在功能语言中尤其如此.但是也有一些低级语言将控制流基于连续性,例如Parrot虚拟机将连续性用于控制流,我相信某个研究实验室甚至在某个地方甚至有一些基于连续性的CPU.

Another well-known control flow primitive are continuations. Continuations are basically a more structured, better manageable and less evil variant of GOTO, especially popular in functional languages. But there also some low-level languages that base their control flow on continuations, for example the Parrot Virtual Machine uses continuations for control flow and I believe there are even some continuation-based CPUs in some research lab somewhere.

C具有某种"cr脚"形式的延续(setjmplongjmp),它们的功能远不如真实"延续强大,并且较不易于使用,但它们足够强大,足以实现生成器(实际上,可用于实现完整的延续).

C has a sort-of "crappy" form of continuations (setjmp and longjmp), that are much less powerful and less easy to use than "real" continuations, but they are plenty powerful enough to implement generators (and in fact, can be used to implement full continuations).

在Unix平台上,setcontext可以用作setjmp/longjmp的更强大,更高级的选择.

On a Unix platform, setcontext can be used as a more powerful and higher level alternative to setjmp/longjmp.

另一种众所周知的控制流程构造是例外,但由于底层底层构建 other 控制流程构造是众所周知的,因此可能不会引起人们的注意.有一篇论文表明,异常可能比延续更强大,因此使异常本质上等效于GOTO,因此具有普遍的功能.而且,实际上,有时 用作异常控制流结构:Microsoft Volta项目将.NET字节码编译为JavaScript,并使用JavaScript异常来实现.NET线程和生成器.

Another control flow construct that is well-known, but doesn't probably spring to mind as a low-level substrate build other control flow constructs on top of, are exceptions. There is a paper that shows that exceptions can be more powerful than continuations, thus making exceptions essentially equivalent to GOTO and thus universally powerful. And, in fact, exceptions are sometimes used as universal control flow constructs: the Microsoft Volta project, which compiled .NET bytecode to JavaScript, used JavaScript exceptions to implement .NET threads and generators.

不是通用的,但可能足够强大以实现生成器,只是普通的尾部调用优化. (但是,我可能是错的.不幸的是,我没有证据.)我认为,您可以将生成器转换为一组相互尾递归的函数.我知道状态机可以使用尾部调用来实现,所以我很确定生成器也可以,因为毕竟C#将生成器实现为状态机. (我认为这与惰性评估特别有效.)

Not universal, but probably powerful enough to implement generators is just plain tail call optimization. (I might be wrong, though. I unfortunately don't have a proof.) I think you can transform a generator into a set of mutually tail-recursive functions. I know that state machines can be implemented using tail calls, so I'm pretty sure generators can, too, since, after all, C# implements generators as state machines. (I think this works especially well together with lazy evaluation.)

最后但并非最不重要的一点是,在具有标准化调用堆栈的语言中(例如,像大多数Smalltalks一样),您可以构建几乎任何所需的控制流构造. (实际上,标准化的调用栈基本上是过程上的低端等效于功能上的高层延续.)

Last but not least, in a language with a reified call stack (like most Smalltalks for example), you can build pretty much any kind of control flow constructs you want. (In fact, a reified call stack is basically the procedural low-level equivalent to the functional high-level continuation.)

那么,生成器的其他实现是什么样的?

So, what do other implementations of generators look like?

Lua本身没有生成器,但是具有完整的不对称协程.主要的C实现使用setjmp/longjmp来实现它们.

Lua doesn't have generators per se, but it has full asymmetric coroutines. The main C implementation uses setjmp/longjmp to implement them.

Ruby本身还没有生成器,但是它有Enumerator可用作生成器. Enumerator不是语言的一部分,它们是库功能. MRI使用延续来实现Enumerator,而延续又通过setjmp/longjmp实现. YARV使用Fiber(这是Ruby拼写协程"的方式)实现Enumerator,而那些是使用setjmp/longjmp实现的.我相信JRuby当前使用线程来实现Enumerator,但是他们希望在JVM获得更好的控制流构造后立即切换到更好的状态.

Ruby also doesn't have generators per se, but it has Enumerators, that can be used as generators. Enumerators are not part of the language, they are a library feature. MRI implements Enumerators using continuations, which in turn are implemented using setjmp/longjmp. YARV implements Enumerators using Fibers (which is how Ruby spells "coroutines"), and those are implemented using setjmp/longjmp. I believe JRuby currently implements Enumerators using threads, but they want to switch to something better as soon as the JVM gains some better control flow constructs.

Python的生成器实际上或多或少是成熟的协程. CPython使用setjmp/longjmp来实现它们.

Python has generators that are actually more or less full-blown coroutines. CPython implements them using setjmp/longjmp.

这篇关于实施“生成器".支持自定义语言的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆