How to parse JavaScript function expression calls with ANTLR?

Question

I am building a JavaScript instrumentor with ANTLR, using Patrick Hulsmeijer's EcmaScript 3 grammar.

I'm having a problem parsing this line of code:

function(){}();

It is a direct call of a function expression. The parser recognizes the statement as a function declaration and then fails when it reaches the parentheses after the function body. The reason is that function declarations are given the highest precedence in order to avoid the ambiguity with function expressions.

This is how the grammar recognizes function declarations:

sourceElement
options
{
    k = 1 ;
}
    : { input.LA(1) == FUNCTION }? functionDeclaration
    | statement
    ;

I am not even sure that it is a valid EcmaScript statement. Is it?
I think it would be more correct to write:

(function(){})();

which the parser actually handles well.
That is not the core of the question, though, because I have no control over the code to be instrumented.

I tried removing functionDeclaration from the sourceElement production and adding it to the statementTail production:

statementTail
    : variableStatement
    | emptyStatement
    | expressionStatement
    | functionDeclaration
    | ifStatement
    | ...
    ;

But this produces a build error:

[fatal] rule statementTail has non-LL(*) decision due to recursive rule invocations reachable from alts 3,4. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
|---> : variableStatement

This happens because the variableStatement production contains functionExpression as a descendant, which leads to an ambiguity. The parser cannot choose between functionDeclaration and functionExpression because they are almost identical:

functionDeclaration
    : FUNCTION name=Identifier formalParameterList functionBody
    -> ^( FUNCTIONDECL $name formalParameterList functionBody )
    ;

functionExpression
    : FUNCTION name=Identifier? formalParameterList functionBody
    -> ^( FUNCTIONEXPR $name? formalParameterList functionBody )
    ;

Note: I modified the original rewrite rules to use different tree nodes (FUNCTIONDECL and FUNCTIONEXPR) because I need to tell the two apart while walking the AST.

How can I resolve this ambiguity?

Recommended answer

The parser is right to expect a functionDeclaration when a sourceElement begins with the 'function' keyword. This in fact implements the following restriction from the ECMAScript Language Specification:

an ExpressionStatement cannot start with the function keyword because that might make it ambiguous with a FunctionDeclaration.

The statement in question is therefore invalid under this restriction, even though it is not actually ambiguous under the grammar's productions: because it omits the function identifier, it cannot be a functionDeclaration. A statement that does expose the syntactic ambiguity would be

function f(){}(42)

which, according to the ECMAScript spec, is a functionDeclaration followed by an expressionStatement.

So the best thing to do is to ask the provider of this code for correct syntax. You said that you need to parse it anyway, and that could possibly be done using ANTLR's backtracking. Make sure the function identifier is mandatory in functionDeclaration, and have the parser try a functionDeclaration before a statement. But be aware that, even if this helps with the original statement, it will still fail for

function f(){}()

because here the functionDeclaration can be completed successfully, but no valid statement follows it.
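
The four snippets discussed above can be checked against a JavaScript engine. The following is only a rough sketch: the helper name parses is made up for illustration, it relies on the built-in Function constructor to compile (not run) a source string as a function body, and it assumes an environment where dynamic code compilation is allowed. Modern engines implement later editions than ECMAScript 3, but they should behave the same way for these cases.

```javascript
// Hypothetical helper (not part of any library): reports whether `source`
// compiles as a function body. new Function() only parses and compiles the
// text here; the resulting function is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected, so let it surface
  }
}

const samples = [
  "function(){}();",    // statement starting with 'function' but no name
  "(function(){})();",  // parenthesized function expression, then a call
  "function f(){}(42)", // declaration followed by the expression statement (42)
  "function f(){}()",   // declaration followed by '()', which is not a statement
];

for (const source of samples) {
  console.log(source, "=>", parses(source) ? "parses" : "SyntaxError");
}
```

Only the second and third snippets should be accepted, which matches the behaviour the grammar is trying to reproduce. An instrumentor that must also accept the first form would be deliberately more lenient than the specification.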
