什么是功能/闭包/ Lambdas,从数据结构的角度? [英] What Are Functions/Closures/Lambdas, From A Data Structures Perspective?

查看:156
本文介绍了什么是功能/闭包/ Lambdas,从数据结构的角度?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有一天我讨论了一些关于编程语言的细节细节,一个主题是从数据结构的角度来看实际上是一个函数(或闭包/ lambda /等) 。要清楚,我不是问什么函数做什么或如何使用它们,而是它们如何在代码幕后表示。



当函数是被调用,代码执行时发生了什么?是一个函数是使用标识符传递的某种结构或对象吗?一个函数是否甚至由数据结构表示?它在语言之间是否有显着差异,或者是在编译/解释代码中通常用相同的方式表示的函数吗?我一直假设函数只是影响数据的操作集合,并且它们不以任何类型的结构/对象形式存在,但是我开始怀疑,因为有很多情况下函数被视为变量/参数,



在Swift中的示例:

  let closure = {(firstInt:Int,secondInt:Int) - > Bool in 
return firstInt> secondInt
}

这种行为如何操作(在Swift或任何支持闭包)?什么是存储在闭包常数?当我在Python中调用.map(function,iterable)时,什么是发送到该调用来表示该函数?我希望这个问题看起来不太广泛,但我认为知道函数是如何构建和操作的,这将是有趣的和可能有帮助。我遇到的大多数人似乎只关注变量/结构/对象等的细节。

解决方案

我会回答这个问题。一个JavaScript的透视图,但我认为这应该提供一个非常一般的情况下命令式语言与一流的函数。



函数值



在JavaScript中,函数只是一个对象,可以像任何其他对象一样被赋予属性:

  var someObj = {}; 
someObj.someProperty =嘿,我现在有一个财产;

var someFunc = function(a,b){return a> b; };
someFunc.someProperty =嘿,我现在有一个财产了;

因此,任何你可以做的对象,你可以做一个函数。但是,函数对象有一个内部属性(JavaScript执行引擎可访问的名称属性,但JavaScript代码不可访问) [[Code]] 和另一个内部属性 [[FormalParameters]] 规范将这些定义为:


[[Code]] - 函式的ECMAScript程式码。



[[FormalParameters]] - 一个可能为空的List,包含一个函数的FormalParameterList的字符串。


所以,如果你的语言不支持闭包,这真的是你需要的:一个参数名称列表和一些代码执行(这里,代码语言通常用于表示函数外的可运行代码:编译的二进制,可解释的命令或中间字节码)。这是很容易处理这些属性,大致相同的方式你的语言已经处理的正常属性。每当您调用 someFunc(val1,val2)时,您在运行 [[Code]] a b 形式参数的值。



如果你的语言需要类型化的返回值,这可能是第三个属性,当你尝试 return 时检查。此外,您的语言可能需要输入的形式参数,在wihch中,请考虑正式参数列表 {string,type} 元组的列表。



闭包



当你处理闭包时,事情变得更糟。要查看,关闭是


  1. 功能代码(加上参数和返回类型)和

  2. 自由变量(或非本地变量),功能代码可访问的变量的层次集合,以及映射到这些变量的标识符名称

在JavaScript中,函数可以看到在包含它们的函数中定义的所有变量:

  function foo(){
var qux = 5;
function bar(){return qux; }
return bar;
}

var returnedBar = foo();
returnsBar();

这里, bar c $ c> foo ,所以当 bar 运行时,它可以查看up范围链,看到 c>



这需要一个精心设计的数据结构网络才能正常工作。在我的例子中, bar 可以访问 foo 的变量的值。具体来说, bar 可以访问 foo 中定义的变量,用于 foo 导致定义 bar 的特定定义。如果我们多次调用 foo ,事情会很快变得毛茸茸:

  function foo(quxVal){
var qux = quxVal;
function bar(){return qux; }
return bar;
}

var returnedBar1 = foo(1);
returnedBar1();

var returnedBar2 = foo(2);
returnedBar2();

这里, returnedBar1 returnedBar2 具有相同的功能代码,但它们的词汇环境不同,因为它们有不同的父 foo

关闭数据结构



在JavaScript中,范围是基于功能的。 环境记录 / strong> 跟踪每个作用域包含的变量,并将标识符名称映射到这些变量。环境记录表可能如下所示:

 名称变量
---- ------- -
foo [这个var的值就是现在的地址0x34F6A2的任何值]
bar [这个var的值是任何地址0x34F6A6的值]

A 词汇环境 是一个保存环境记录的结构。每个词法环境是范围链的一部分,表示为链表:


词法环境由一个环境记录和一个可能空引用外部词法环境。


这是我们如何看起来上到父函数, bar 查看 foo 的值为 qux 。我们遍历范围链,这是一个链接的词汇环境结构列表,向上指向更高的范围。每个词汇环境都具有环境记录(变量标识符名称到变量的映射)和指向其父词法环境。



当我们调用 foo(1)时,它会创建一个新的词法环境记录作为运行该函数的一部分。 foo(1)词法环境有一个环境记录,其中包含 qux $ c> foo 。然后,当我们调用 bar 时,创建另一个词法环境,它指向父 foo(1) 。 ( bar 函数具有 [[Scope]] 内部属性,在函数调用时使用当我们调用 returnedBar1 时,引擎在查询中查找 qux 变量 bar 环境记录优先;那么,当失败时,引擎查看父词法环境,找到 foo 中声明的 qux p>

当我们调用 foo(2)时,我们创建一个全新的词法环境,然后调用 bar 创建一个新的词法环境,它是 foo(2)词汇环境的子元​​素。当我们调用 returnedBar2 时,我们将一个完全不同的链传播到 foo(2)



在实际数据结构术语中,每个函数都有一个 [[Scope]] 属性,时间,指示哪个词法环境函数应该使用作为它自己的词法环境的父代,当函数运行时。环境记录(词汇环境的内容)可以表示为变量名称到变量的映射(其中变量是具有指向该变量当前表示的任何值的指针的结构)。



结论



最后,下面是闭包需要的实用属性:




  • 代码(某种可运行代码) - 用于运行函数

  • (字符串名称和可选的类型列表) - 用于在调用函数时标识函数的参数

  • 返回类型(类型) - 用于验证函数的返回值

  • 词法作用域对象(词法环境) - 当函数被调用时,在函数调用期间创建的词法环境使用这个词法环境对象作为它的父



词法环境由两部分组成:




  • 环境记录(变量名映射到变量) - 用于查找与给定标识符名称相关联的变量(例如,我们需要一个名为 / code> - 在哪里可以设置/获取其在内存中的值?)

  • 父词法环境词汇环境,用作链表的一部分,允许范围链中的变量查找


I got into a discussion the other day about some nitty-gritty details of programming languages, and one topic that was brought up was what a function (or closure/lambda/etc.) actually 'is' from a data structures perspective. To be clear, I am not asking what functions do or how they are used, but rather how they are represented in code 'behind the scenes'.

When a function is 'called', what is happening as the code executes? Is a function a struct or object of some kind that is passed around using an identifier? Is a function even represented by a data structure at all? Does it vary significantly from language to language, or are functions generally represented the same way in the compiled/interpreted code? I had always assumed that functions were just sets of operations that affect data and that they don't exist in any sort of struct/object form, but then I started to wonder because there are many cases where functions are treated as variables/parameters and can be used in a manner very similar to structs or objects, for instance with closures.

Example in Swift:

let closure = {(firstInt: Int, secondInt: Int) -> Bool in
    return firstInt > secondInt
}

How does this sort of behavior operate (in Swift or any language that supports closures)? What is being stored in the 'closure' constant there? When I call .map(function, iterable) in Python, what is being 'sent' to that call to represent the function? I hope that this question doesn't seem overly broad, but I thought it would be interesting and possibly helpful to know how functions are built and operated on. Most people I've encountered seem to focus only on the details of how variables/structs/objects/etc. operate without paying much attention (or at least not as much) to the inner workings of functions.

解决方案

I'll answer from a JavaScript perspective, but I think this should provide a pretty general case for imperative languages with first-class functions.

Function values

In JavaScript, a function is just an object, which can be given properties like any other object:

var someObj = {};
someObj.someProperty = "hey, I have a property now";

var someFunc = function(a, b) { return a > b; };
someFunc.someProperty = "hey, I have a property now, too";

Thus, anything you can do to an object, you can do to a function. However, function objects have an internal property (a notional property accessible to the JavaScript execution engine, but not accessible to JavaScript code) called [[Code]], and another internal property called [[FormalParameters]]. The spec defines these as:

[[Code]] - The ECMAScript code of a function.

[[FormalParameters]] - A possibly empty List containing the identifier Strings of a Function’s FormalParameterList.

So, if your language doesn't support closures, that's really all you need: an list of parameter names and some code to execute (here, "code" is whatever your language normally uses to express runnable code outside of functions: compiled binary, interpretable commands, or intermediate bytecode). It's easy enough to treat these properties in roughly the same way your language already treats normal properties. Whenever you call someFunc(val1, val2), you run the contents of the [[Code]] property after supplying values for the a and b formal parameters.

If your language requires typed return values, that's probably a third property, checked at the point when you try to return something. Also, your language may require typed formal parameters, in wihch case consider the formal parameter list a list of {string, type} tuples.

Closures

Things get much worse when you're dealing with closures. To review, a closure is

  1. functional code (plus parameters and a return type), and
  2. free variables (or non-local variables), a hierarchical set of variables accessible to the functional code, coupled with the identifier names that map onto those variables

In JavaScript, functions can see all variables defined in the functions that contain them:

function foo() {
    var qux = 5;
    function bar() { return qux; }
    return bar;
}

var returnedBar = foo();
returnedBar();

Here, bar is inside of foo, so when bar runs, it can look "up" the scope chain to see the qux variable from foo.

This needs an elaborate network of data structures to work properly. In my example, bar can access the value of foo's variables. Specifically, bar can access the variables defined inside of foo for the given invocation of foo that caused that particular definition of bar to become defined. Things can get hairy very quickly if we invoke foo multiple times:

function foo(quxVal) {
    var qux = quxVal;
    function bar() { return qux; }
    return bar;
}

var returnedBar1 = foo(1);
returnedBar1();

var returnedBar2 = foo(2);
returnedBar2();

Here, returnedBar1 and returnedBar2 have the same functional code, but their lexical environments are different, because they have different "parent" foo invocations with differing environment records.

Closure Data Structures

In JavaScript, scope is function based. An environment record in JavaScript keeps track of what variables each scope contains, and it maps identifier names to those variables. An environment record table might look like this:

Name    Variable
----    --------
foo     [value of this var is whatever is in address 0x34F6A2 right now]
bar     [value of this var is whatever is in address 0x34F6A6 right now]

A lexical environment is a structure that holds an environment record. Each lexical environment is part of a scope chain, represented as a linked-list:

A Lexical Environment consists of an Environment Record and a possibly null reference to an outer Lexical Environment.

This is how we can look "up" into "parent" functions, as we do when bar looks at foo for its value of qux. We traverse the scope chain, which is a linked list of lexical environment structures, directed upward into higher scopes. Each lexical environment has an environment record (mapping of variable identifier names to variables) and points to its parent lexical environment.

When we call foo(1) above, it creates a new lexical environment record as part of running the function. That foo(1) lexical environment has an environment record, which contains a record for the qux variable inside foo. Then, when we call bar, that creates another lexical environment, which points to the parent foo(1) lexical environment. (The bar function has a [[Scope]] internal property, which is used at function-invocation time to tell the new lexical environment what its parent is.) When we call returnedBar1, the engine looks for the qux variable in the bar environment record first; then, when that fails, the engine looks at the parent lexical environment to find the qux declared in foo.

When we call foo(2), we create a totally new lexical environment, and then we call bar to create a new lexical environment that is a child of the foo(2) lexical environment. When we call returnedBar2, we propagate up a totally different chain to the foo(2).

In practical data structure terms, each function has a [[Scope]] property, set at definition time, that indicates which lexical environment that function should use as a parent for its own lexical environment whenever the function runs. An environment record (the "content" of a lexical environment) can be represented as a mapping of variable names to variables (where a "variable" is a structure with a pointer to whatever value the variable currently represents).

Conclusion

To conclude, here's the practical properties a closure needs:

  • Code (some kind of runnable code) - used to run the function
  • Parameter list (list of string names and, optionally, types) - used to identify a function's arguments when it is called
  • Return type (type) - used to validate the function's return value
  • Lexical scope object (lexical environment) - when the function is called, the lexical environment created during the function invocation uses this lexical environment object as its parent

A lexical environment consist of two parts:

  • Environment record (mapping of variable names to variables) - used to find the variable associated with a given identifier name (e.g., we want a variable named "qux" -- where can we set/get its value in memory?)
  • Parent lexical environment (lexical environment object) - A parent lexical environment, used as part of a linked list, to allow variable look-ups up the scope chain

这篇关于什么是功能/闭包/ Lambdas,从数据结构的角度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆