Using deftransform/defknown in SBCL internals to get the compiler to transform user authored functions


Question

At the end of section 6.5 in the current SBCL manual, we have the following quote:


If your system's performance is suffering because of some construct which could in principle be compiled efficiently, but which the SBCL compiler can't in practice compile efficiently, consider writing a patch to the compiler and submitting it for inclusion in the main sources. Such code is often reasonably straightforward to write; search the sources for the string "deftransform" to find many examples (some straightforward, some less so).

I've been playing around and found the likes of sb-c::defknown and sb-c::deftransform but thus far have had little luck in successfully adding any new transforms that do anything.

Let's pretend I have the following three toy functions:

(defun new-+ (x y)
  (+ x y))

(defun fixnum-+ (x y)
  (declare (optimize (speed 3) (safety 0))
           (fixnum x y))
  (+ x y))

(defun string-+ (x y)
  (declare (optimize (speed 3) (safety 0))
           (string x y))
  (concatenate 'string x y))

As a purely toy example, let's say we wanted to tell the compiler that it could transform calls to my user-defined function new-+ into calls to either fixnum-+ or string-+.

The condition for the compiler transforming (new-+ x y) into (fixnum-+ x y) would be knowing that the arguments x and y are both of type fixnum, and the conditions for transforming into (string-+ x y) would be knowing that the arguments x and y are both of type string.

So the questions are:

  1. Can I actually do this?
  2. What are the actual mechanics of doing so and generating other user based transforms/extensions?
  3. Any reading or sources apart from manually reading through the source to discover more info regarding this?
  4. If I can't do this using the likes of deftransform, is there any other way I could do so?

Note: I'm aware of the operations and nature of macros and generic functions in general Common Lisp coding, and don't consider using them an answer to this question, since I'm specifically curious about extending the SBCL internals and interacting with its compiler.

Recommended answer

I now attempt to provide a broad overview that answers my questions and may point others towards constructively investigating similar directions.



  1. Can I actually do this?


Yes. Though depending on the specifics of how and why, you may have a choice of options available to you, and they may have variable levels of portability between Common Lisp implementations.



  2. What are the actual mechanics of doing so and generating other user based transforms/extensions?


I answer this with respect to two possible methods that the programmer may choose to get started, and which seem most applicable.

For both examples, I reiterate that, having reflected on the topic only briefly, I think it is bad form for a transformation to change the relationship between a function's inputs and outputs. I do so here for demonstration purposes only, to verify that the transformations I'm implementing are actually taking place.

I actually had quite a difficult time testing that my transformations were actually happening. SBCL especially seems quite happy to optimise certain expressions and forms on its own, and there are additional pieces of information you can make available to the compiler that are not covered here. Additionally, there may be other transformations available, so just because your transform isn't used doesn't necessarily mean it isn't "working".

Using Environments and DEFINE-COMPILER-MACRO Extensions from Common Lisp the Language 2

I was previously under the impression that DEFINE-COMPILER-MACRO was relatively limited in its abilities, working only on types connected with literal values, but this is not necessarily the case.

To demonstrate this, I use three user-defined functions and a compiler macro.

First: We will begin with a general addition function gen+ that decides at run-time to either add two numbers together, or concatenate two strings:

(defun gen+ (x y)
  (if (and (numberp x)
           (numberp y))
      (+ x y)
      (concatenate 'string x y)))

But say we know at compile time that in certain instances, only strings will be fed to this function. Let's define our specialised string addition function, and to prove it's actually being used, we'll do a very bad thing stylistically and additionally concatenate the string "kapow" as well:

(defun string+ (x y)
  (declare (optimize (speed 3) (safety 0))
           (string x y))
  (concatenate 'string x y "kapow"))

The following function is a very simple convenience function that checks an environment to establish whether the declared type of a variable bound in that environment is eq to STRING. We're using a non-ANSI function here from Common Lisp the Language, 2nd edition (CLtL2). In SBCL, the function VARIABLE-INFORMATION and the other CLtL2 environment functions are available in the sb-cltl2 package, which is a contrib module and has to be loaded first with (require :sb-cltl2).

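(defun env-stringp (symbol environment)
  ;; VARIABLE-INFORMATION only accepts symbols, so reject any other form.
  (and (symbolp symbol)
       ;; The third value is an alist of declarations; look up TYPE.
       (eq 'string
           (cdr (assoc 'type
                       (nth-value 2 (sb-cltl2:variable-information symbol environment)))))))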
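To see what VARIABLE-INFORMATION actually reports, a small macro can capture the environment at its call site and return all three values (the binding kind, whether the binding is local, and an alist of declarations). This is only a sketch: SHOW-VARIABLE-INFO is a hypothetical helper name, and the exact contents of the declarations alist may vary between SBCL versions.

```lisp
;; Sketch only: assumes SBCL with the sb-cltl2 contrib available.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-cltl2))

;; SHOW-VARIABLE-INFO is a hypothetical helper, not part of SBCL.
(defmacro show-variable-info (var &environment env)
  ;; Expands into a quoted list holding the three values that
  ;; VARIABLE-INFORMATION returns for VAR at the call site.
  `',(multiple-value-list (sb-cltl2:variable-information var env)))

(let ((s "bob"))
  (declare (string s) (ignorable s))
  ;; The first element is the binding kind, the third the declarations alist.
  (print (show-variable-info s)))
```

The type entry in the third element is what env-stringp compares against STRING.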

Lastly, we use DEFINE-COMPILER-MACRO to generate the transformation. I've tried to name things in this code differently from other examples I've seen, so that people can follow along without mixing up which variable/symbol is in which scope/context. A couple of things I didn't previously know about DEFINE-COMPILER-MACRO:


  • The variable that immediately follows the &whole parameter represents the form of the initial call. In our example it will be bound to the list (GEN+ A B).
  • arg1 is bound to the symbol A.
  • arg2 is bound to the symbol B.
  • The &environment parameter says that within this macro, the symbol ENV will be bound to the environment in which the macro is being expanded. This is what lets us "kind of step back out of the macro" and check the surrounding code for declarations regarding the types of the variables named by the symbols bound to ARG1 and ARG2.

In this definition, we tell the compiler macro that if the user has declared the parameters of GEN+ to be strings, then replace the call to (GEN+ ARG1 ARG2) with a call to (STRING+ ARG1 ARG2).

Note that because the condition of this transformation is the result of a user-defined operation on the environment, if the parameters to GEN+ are literal strings, the transformation will not be triggered, because the environment does not see that the variables have been declared strings. To do that, you would have to add another option and transformation to explicitly check the types of the values in ARG1 and ARG2 as per a traditional use of DEFINE-COMPILER-MACRO. This can be left as an exercise for the reader. But beware about the utility of doing so, because SBCL, for instance, might constant-fold your expression rather than use your transformation anyway.

(define-compiler-macro gen+ (&whole form arg1 arg2 &environment env)
  (cond ((and (env-stringp arg1 env)
              (env-stringp arg2 env))
         `(string+ ,arg1 ,arg2))
        (t form)))

Now we can test it with a simple call with type declarations:

(let ((a "bob")
      (b "dole"))
  (declare (string a b))
  (gen+ a b))

This should return the string "bobdolekapow" as the call to GEN+ was transformed into a call to STRING+ based on the declared types of the variables A and B, not just literal types.
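Since it can be hard to tell whether a compiler macro fired, one option is to ask for the expansion directly instead of inferring it from the return value. The sketch below also takes a stab at the exercise mentioned earlier by accepting literal strings as well as declared variables. EXPANSION-OF and KNOWN-STRING-P are hypothetical helper names, the optimize declarations are dropped for brevity, and the result for declared variables depends on how SBCL reports declared types, so treat it as an assumption rather than a guarantee.

```lisp
;; Sketch only: assumes SBCL with the sb-cltl2 contrib available.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-cltl2)

  (defun env-stringp (symbol environment)
    ;; True when SYMBOL names a variable declared STRING in ENVIRONMENT.
    (and (symbolp symbol)
         (eq 'string
             (cdr (assoc 'type
                         (nth-value 2 (sb-cltl2:variable-information
                                       symbol environment)))))))

  ;; KNOWN-STRING-P is a hypothetical helper: a literal string, or a
  ;; variable declared STRING.
  (defun known-string-p (form environment)
    (or (stringp form)
        (env-stringp form environment))))

(defun gen+ (x y)
  (if (and (numberp x) (numberp y))
      (+ x y)
      (concatenate 'string x y)))

(defun string+ (x y)
  (declare (string x y))
  (concatenate 'string x y "kapow"))

(define-compiler-macro gen+ (&whole form arg1 arg2 &environment env)
  (if (and (known-string-p arg1 env)
           (known-string-p arg2 env))
      `(string+ ,arg1 ,arg2)
      form))

;; EXPANSION-OF is a hypothetical helper: it returns, as a quoted constant,
;; what the compiler macro for FORM's operator produces at the call site.
(defmacro expansion-of (form &environment env)
  `',(funcall (compiler-macro-function (first form)) form env))

(print (expansion-of (gen+ "bob" "dole")))   ; => (STRING+ "bob" "dole")
(print (expansion-of (gen+ 1 2)))            ; => (GEN+ 1 2), left unchanged

(let ((a "bob") (b "dole"))
  (declare (string a b) (ignorable a b))
  ;; Should be (STRING+ A B) if the declarations are visible in the environment.
  (print (expansion-of (gen+ a b))))
```

Declaring gen+ notinline around a call, for example (locally (declare (notinline gen+)) (gen+ "bob" "dole")), suppresses the compiler macro, which gives a convenient baseline to compare against.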

Using Basic (defknown)/(deftransform) Combinations with the SBCL Compiler

The previous technique is indeed potentially useful, more powerful and flexible than transforming on the types of literals, and while not standard ANSI Common Lisp, is more portable/adaptable to other implementations than the technique that follows.

A reason you might forgo the former technique in favour of the one that follows is that the former doesn't get you everything. You still had to declare the types of the variables a and b, and write a user-defined function to extract the declared type information from the environment.

If you can interact directly with the SBCL compiler, however, then at the cost of some potential brittleness and extreme non-portability you gain the ability to hack into the compiler itself and benefit from things like type propagation: you might not need to explicitly inform the compiler of the types of A and B for it to apply your transformation.

For our example, we will implement a very basic transformation on the functions wat and string-wat, which are identical in form to our previous functions gen+ and string+.

Understand that there are many more pieces of information and optimisation hints you can feed the SBCL compiler that are not covered here. If anyone more experienced with SBCL internals wants to correct/extend anything regarding my impressions here, please comment and I'll be happy to update my answer.

First, we tell the compiler about the existence and type signature of wat. We do this by calling defknown in the sb-c package, informing it that wat takes two parameters of any type, (T T), and that its result can be of any type: *

(sb-c:defknown wat (T T) *)

Then we define a simple transform using sb-c:deftransform, essentially saying when the two parameters fed to wat are strings, we transform the code into a call to string-wat.

(sb-c:deftransform wat ((x y) (string string) *) 
  `(string-wat x y))

The forms of wat and string-wat for completeness:

(defun wat (x y)
  (if (and (numberp x)
           (numberp y))
      (+ x y)
      (concatenate 'string x y)))

(defun string-wat (x y)
  (declare (optimize (speed 3) (safety 0))
           (string x y))
  (concatenate 'string x y "watpow"))

And this time a demonstration in SBCL using bound variables but no explicit type declarations:

(let ((a (concatenate 'string "bo" "b"))
      (b (concatenate 'string "dole")))
  (wat a b))

The string returned should be "bobdolewatpow".
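The same transform can also be exercised from inside an ordinary function, where the argument types come from a parameter declaration and a literal rather than from a LET. This is a sketch that reuses the definitions above in one self-contained piece; GREET is a hypothetical caller, the optimize declarations are dropped for brevity, and because sb-c is an unsupported internal interface, whether the transform fires may depend on the SBCL version and the optimisation settings in effect.

```lisp
;; Sketch only: SB-C internals are unsupported and may change between
;; SBCL versions.
(sb-c:defknown wat (t t) *)

(sb-c:deftransform wat ((x y) (string string) *)
  ;; The body returns the form that replaces the call to WAT.
  '(string-wat x y))

(defun wat (x y)
  (if (and (numberp x) (numberp y))
      (+ x y)
      (concatenate 'string x y)))

(defun string-wat (x y)
  (declare (string x y))
  (concatenate 'string x y "watpow"))

;; GREET is a hypothetical caller used only for this demonstration.
(defun greet (name)
  (declare (string name))
  ;; Both arguments are known to be strings here, so the compiler is free
  ;; to apply the transform and call STRING-WAT instead.
  (wat "hello, " name))

;; If the transform was applied, the result ends in "watpow";
;; otherwise it is the plain concatenation.
(print (greet "bob"))
```

Comparing the output of greet with and without the deftransform loaded is a simple way to confirm whether the compiler picked it up.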



  3. Any reading or sources apart from manually reading through the source to discover more info regarding this?


I haven't been able to find anything much about this out there, and would say that to get much deeper, you're going to have to start trawling through some source code.

The SBCL GitHub mirror is currently available here.

User @PuercoPop has suggested background reading of Starting to Hack on SBCL and The Python Compiler for CMU Common Lisp, albeit I am including a link to a .pdf version rather than a .ps version commonly linked to.
