How to safely read untrusted Clojure code (not just some serialized data)?


Question

(def evil-code (str "(" (slurp "/mnt/src/git/clj/clojure/src/clj/clojure/core.clj") ")" ))
(def r (read-string evil-code ))

This is unsafe.

(def r (clojure.edn/read-string evil-code))
RuntimeException Map literal must contain an even number of forms  clojure.lang.Util.runtimeException (Util.java:219)

This does not work...

How can I read Clojure code into a tree safely (preserving all '#'s as themselves is desirable)? Imagine a Clojure antivirus that wants to scan the code for threats and wants to work with a data structure, not with plain text.

Recommended answer

First of all, you should never read Clojure code directly from untrusted data sources. Use EDN or another serialization format instead.

That being said, since Clojure 1.5 there is a reasonably safe way to read strings without evaluating them: bind the *read-eval* var to false before calling read-string. In Clojure 1.4 and earlier, this could still result in side effects caused by Java constructors being invoked. Those problems have since been fixed.

Here is some example code:

(defn read-string-safely [s]
  (binding [*read-eval* false]
    (read-string s)))

(read-string-safely "#=(eval (def x 3))")
=> RuntimeException EvalReader not allowed when *read-eval* is false.  clojure.lang.Util.runtimeException (Util.java:219)

(read-string-safely "(def x 3)")
=> (def x 3)

(read-string-safely "#java.io.FileWriter[\"precious-file.txt\"]")
=> RuntimeException Record construction syntax can only be used when *read-eval* == true  clojure.lang.Util.runtimeException (Util.java:219)
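
The question wraps the whole file in parentheses so that read-string returns a single list. As a minimal sketch of an alternative, the forms can be read one at a time from a PushbackReader with *read-eval* bound to false. The name read-all-safely is made up for this illustration and is not part of the original answer.

```clojure
(import '(java.io PushbackReader StringReader))

;; Hypothetical helper: reads every top-level form in the string s
;; without evaluating anything at read time.
(defn read-all-safely [s]
  (let [rdr (PushbackReader. (StringReader. s))
        eof (Object.)]                          ; unique end-of-input sentinel
    (binding [*read-eval* false]
      (loop [forms []]
        (let [form (read rdr false eof)]        ; return eof instead of throwing
          (if (identical? form eof)
            forms
            (recur (conj forms form))))))))

(read-all-safely "(def x 3) (defn f [y] (+ x y))")
```

This only prevents read-time evaluation. The returned forms are plain data and must still never be passed to eval.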

About reader macros

The dispatch macro (#) and tagged literals are invoked at read time. There is no representation for them in Clojure data, since by that time these constructs have all been processed. As far as I know, there is no built-in way to generate a syntax tree of Clojure code.
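
A small sketch of what "already processed" means in practice: the reader returns the expanded forms, so the original '#' and other prefix characters cannot be recovered from the result. The helper name read-form is an assumption made for this illustration.

```clojure
;; Hypothetical helper for this illustration.
(defn read-form [s]
  (binding [*read-eval* false]
    (read-string s)))

(read-form "'x")               ; the quote character becomes (quote x)
(read-form "@x")               ; becomes (clojure.core/deref x)
(read-form "#'foo")            ; becomes (var foo)
(read-form "#_skipped 42")     ; the discarded form is gone, only 42 remains
(first (read-form "#(inc %)")) ; fn*, with % replaced by a generated symbol
(class (read-form "#\"a+\""))  ; java.util.regex.Pattern, not source text
```

A scanner that works on these results sees (var foo) and cannot tell whether the source said #'foo or (var foo).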

You will have to use an external parser to retain that information. Either roll your own custom parser or use a parser generator such as Instaparse or ANTLR. A complete Clojure grammar for either of those might be hard to find, but you could extend one of the EDN grammars to include the additional Clojure forms. A quick Google search turned up an ANTLR grammar for Clojure syntax; you could alter it to support any constructs that are missing.
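
To give a rough idea of the roll-your-own route, here is a minimal tokenizer sketch that never invokes the Clojure reader and keeps dispatch prefixes such as # and #= as tokens of their own. The token pattern and the names token-re, tokenize and suspicious-tokens are assumptions made for this illustration. It is not a complete Clojure grammar: character literals such as \( are not handled, for instance.

```clojure
;; Simplified token pattern (an assumption for this sketch): dispatch
;; prefixes, delimiters, strings, comments, then everything else as an atom.
(def token-re
  #"#[=_'?^]|#|~@|[\[\]{}()'`~^@]|\"(?:\\.|[^\\\"])*\"|;[^\n]*|[^\s,\[\]{}()'\"`;]+")

(defn tokenize
  "Splits source text into tokens without invoking the Clojure reader."
  [source]
  (re-seq token-re source))

(defn suspicious-tokens
  "Returns the read-eval dispatch tokens (#=) found in source."
  [source]
  (filter #{"#="} (tokenize source)))

(tokenize "#(inc %) #'foo #=(evil)")
(suspicious-tokens "(def x \"#= inside a string\") #=(evil)")
```

Because a string literal is matched as a single token, the "#=" inside the string is not reported; only the real dispatch prefix is. A real tool would build a tree from these tokens or use one of the grammars mentioned above.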

There is also Sjacket, a library made for Clojure tools that need to retain information about the source code itself. It seems like a good fit for what you are trying to do, but I don't have any experience with it personally. Judging from the tests, its parser does support reader macros.
