为什么 ANTLR 生成的解析器会重用上下文对象? [英] Why does parser generated by ANTLR reuse context objects?

查看:30
本文介绍了为什么 ANTLR 生成的解析器会重用上下文对象?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用 ANTLR 为简单的编程语言创建解释器.我想添加递归功能.

I'm trying to create an interpreter for a simple programming language using ANTLR. I would like to add the feature of recursion.

到目前为止,我已经实现了定义和调用函数,并可以选择使用几个 return 语句和局部变量.为了实现具有局部变量,我用一个字典扩展了 FunctionCallContext 的解析器部分类.我可以成功使用它们一次.此外,当我再次从自身(递归地)调用相同的函数时,解析器会为新函数调用创建一个新的上下文对象,正如我所期望的.但是,如果我创建一个更深"的递归,函数调用的第三个上下文将与第二个上下文完全相同(具有相同的哈希码和相同的局部变量).

So far I have implemented the definition and calling functions with option of using several return statements and also local variables. To achieve having local variables I extended the parser partial class of FunctionCallContext with a dictionary for them. I can successfully use them for one time. Also, when I call the same function again from itself (recursively), the parser creates a new context object for the new function call, as I would expect. However,if I create a "deeper" recursion, the third context of the function call will be the very same as the second (having the same hash code and the same local variables).

我的(更新的)语法:

    grammar BatshG;
/*
 * Parser Rules
 */
compileUnit: ( (statement) | functionDef)+;
statement:      print  ';'
            |   println ';'
            |   assignment ';'
            |   loopWhile
            |   branch 
            |   returnStatement ';'
            |   functionCall ';'
;

branch:
          'if' '(' condition=booleanexpression ')' 
                trueBranch=block 
            ('else' falseBranch=block)?;
loopWhile:
          'while' '(' condition=booleanexpression ')' 
                whileBody=block 
;

block: 
                statement
            |   '{' statement* '}';

numericexpression:      
                MINUS onepart=numericexpression               #UnaryMinus
            |   left=numericexpression op=('*'|'/') right=numericexpression  #MultOrDiv
            |   left=numericexpression op=('+'|'-') right=numericexpression  #PlusOrMinus
            |   number=NUMERIC                                      #Number
            |   variableD                                       #NumVariable
;
stringexpression: 
                left=stringexpression PLUSPLUS right=stringexpression #Concat   
            |   string=STRING #String
            |   variableD                                       #StrVariable
            |   numericexpression #NumberToString
;
booleanexpression:
                        left=numericexpression relationalOperator=('<' | '>' | '>=' | '<=' | '==' | '!=' ) right=numericexpression #RelationalOperation
                    |   booleanliteral #Boolean
                    |   numericexpression #NumberToBoolean
;
booleanliteral: trueConst | falseConst ;
trueConst :  'true'          ;
falseConst :  'false'                ;

assignment : varName=IDENTIFIER EQUAL  right=expression;
expression: numericexpression | stringexpression | functionCall | booleanexpression;
println: 'println' '(' argument=expression ')'; 
print: 'print' '(' argument=expression ')';

functionDef: 'function' funcName= IDENTIFIER
                        '(' 
                            (functionParameters=parameterList)?
                        ')'
                            '{' 
                                statements=statementPart?
                            '}' 
;

statementPart:  statement* ;
returnStatement: ('return' returnValue=expression );
parameterList : paramName=IDENTIFIER (',' paramName=IDENTIFIER)*;

functionCall: funcName=IDENTIFIER '(' 
            (functionArguments=argumentList)?
')';
argumentList: expression (',' expression)*;
variableD: varName=IDENTIFIER;
///*
// * Lexer Rules
// */
NUMERIC: (FLOAT | INTEGER);
PLUSPLUS: '++';
MINUS: '-';
IDENTIFIER: [a-zA-Z_][a-zA-Z0-9_]* ;

EQUAL  :            '='              ;

STRING : '"' (~["\r\n] | '""')* '"'          ;

INTEGER: [0-9] [0-9]*;
DIGIT : [0-9]                       ;
FRAC : '.' DIGIT+                   ;
EXP : [eE] [-+]? DIGIT+  ;
FLOAT : DIGIT* FRAC EXP?             ;
WS: [ \n\t\r]+ -> channel(HIDDEN);

    ///*
    // * Lexer Rules
    // */
    NUMERIC: (FLOAT | INTEGER);
    PLUSPLUS: '++';
    MINUS: '-';
    IDENTIFIER: [a-zA-Z_][a-zA-Z0-9_]* ;
    EQUAL  :            '='              ;
    STRING : '"' (~["\r\n] | '""')* '"'          ;
    INTEGER: [0-9] [0-9]*;
    DIGIT : [0-9]                       ;
    FRAC : '.' DIGIT+                   ;
    EXP : [eE] [-+]? DIGIT+  ;
    FLOAT : DIGIT* FRAC EXP?             ;
    WS: [ \n\t\r]+ -> channel(HIDDEN);

我编写的部分解析器类(不是生成的部分):

My partial class of parser written by me (not the generated part):

    public partial class BatshGParser
{
    //"extensions" for contexts:
    public partial class FunctionCallContext
    {
        private Dictionary<string, object> localVariables = new Dictionary<string, object>();
        private bool isFunctionReturning;
        public FunctionCallContext()
        {
           localVariables = new Dictionary<string, object>();
           isFunctionReturning = false;
        }

        public Dictionary<string, object> LocalVariables { get => localVariables; set => localVariables = value; }
        public bool IsFunctionReturning { get => isFunctionReturning; set => isFunctionReturning = value; }
    }

    public partial class FunctionDefContext
    {
        private List<string> parameterNames;

        public FunctionDefContext()
        {
            parameterNames = new List<string>();
        }
        public List<string> ParameterNames { get => parameterNames; set => parameterNames = value; }
    }
}

以及我的访问者的相关部分(可能还有更多):

And relevant parts (and maybe a little more) of my visitor:

         public class BatshGVisitor : BatshGBaseVisitor<ResultValue>
    {
        public ResultValue Result { get; set; }
        public StringBuilder OutputForPrint { get; set; }
        private Dictionary<string, object> globalVariables = new Dictionary<string, object>();
        //string = function name
        //object = parameter list
        //object =  return value
        private Dictionary<string, Func<List<object>, object>> globalFunctions = new Dictionary<string, Func<List<object>, object>>();
        private Stack<BatshGParser.FunctionCallContext> actualFunctions = new Stack<BatshGParser.FunctionCallContext>();

        public override ResultValue VisitCompileUnit([NotNull] BatshGParser.CompileUnitContext context)
        {
            OutputForPrint = new StringBuilder("");

            isSearchingForFunctionDefinitions = true;
            var resultvalue = VisitChildren(context);
            isSearchingForFunctionDefinitions = false;
            resultvalue = VisitChildren(context);
            Result = new ResultValue() { ExpType = "string", ExpValue = resultvalue.ExpValue ?? null };

            return Result;
        }
        public override ResultValue VisitChildren([NotNull] IRuleNode node)
        {
            if (this.isSearchingForFunctionDefinitions)
            {
                for (int i = 0; i < node.ChildCount; i++)
                {
                    if (node.GetChild(i) is BatshGParser.FunctionDefContext)
                    {
                        Visit(node.GetChild(i));
                    }
                }
            }
            return base.VisitChildren(node);
        }
        protected override bool ShouldVisitNextChild([NotNull] IRuleNode node, ResultValue currentResult)
        {
            if (isSearchingForFunctionDefinitions)
            {
                if (node is BatshGParser.FunctionDefContext)
                {
                    return true;
                }
                else
                    return false;
            }
            else
            {
                if (node is BatshGParser.FunctionDefContext)
                {
                    return false;
                }
                else
                    return base.ShouldVisitNextChild(node, currentResult);
            }
        }

        public override ResultValue VisitFunctionDef([NotNull] BatshGParser.FunctionDefContext context)
        {

            string functionName = null;
            functionName = context.funcName.Text;
            if (context.functionParameters != null)
            {
                List<string> plist = CollectParamNames(context.functionParameters);
                context.ParameterNames = plist;
            }
            if (isSearchingForFunctionDefinitions)
                globalFunctions.Add(functionName,
                (
                delegate(List<object> args)
                    {
                        var currentMethod = (args[0] as BatshGParser.FunctionCallContext);
                        this.actualFunctions.Push(currentMethod);
                        //args[0] is the context
                        for (int i = 1; i < args.Count; i++)
                        {

                            currentMethod.LocalVariables.Add(context.ParameterNames[i - 1],
                                (args[i] as ResultValue).ExpValue
                                );
                        }

                    ResultValue retval = null;
                        retval = this.VisitStatementPart(context.statements);

                        this.actualFunctions.Peek().IsFunctionReturning = false;
                        actualFunctions.Pop();
                        return retval;
                    }
                 )
            );
            return new ResultValue()
            {

            };
        }       

        public override ResultValue VisitStatementPart([NotNull] BatshGParser.StatementPartContext context)
        {
            if (!this.actualFunctions.Peek().IsFunctionReturning)
            {
                return VisitChildren(context);
            }
            else
            {
                return null;
            }
        }

        public override ResultValue VisitReturnStatement([NotNull] BatshGParser.ReturnStatementContext context)
        {
            this.actualFunctions.Peek().IsFunctionReturning = true;
            ResultValue retval = null;
            if (context.returnValue != null)
            {
                retval = Visit(context.returnValue);
            }

            return retval;
        }

                public override ResultValue VisitArgumentList([NotNull] BatshGParser.ArgumentListContext context)
        {
            List<ResultValue> argumentList = new List<ResultValue>();
            foreach (var item in context.children)
            {
                var tt = item.GetText();
                if (item.GetText() != ",")
                {
                    ResultValue rv = Visit(item);
                    argumentList.Add(rv);
                }
            }
            return
                new ResultValue()
                {
                    ExpType = "list",
                    ExpValue = argumentList ?? null
                };
        }

        public override ResultValue VisitFunctionCall([NotNull] BatshGParser.FunctionCallContext context)
        {
            string functionName = context.funcName.Text;
            int hashcodeOfContext = context.GetHashCode();
            object functRetVal = null;
            List<object> argumentList = new List<object>()
            {
                context
                //here come the actual parameters later
            };

            ResultValue argObjects = null;
            if (context.functionArguments != null)
            {
                argObjects = VisitArgumentList(context.functionArguments);
            }

            if (argObjects != null )
            {
                if (argObjects.ExpValue is List<ResultValue>)
                {
                    var argresults = (argObjects.ExpValue as List<ResultValue>) ?? null;
                    foreach (var arg in argresults)
                    {
                        argumentList.Add(arg);
                    }
                }
            }

            if (globalFunctions.ContainsKey(functionName))
            {
                {
                    functRetVal = globalFunctions[functionName]( argumentList );
                }
            }

            return new ResultValue()
            {
                ExpType = ((ResultValue)functRetVal).ExpType,
                ExpValue = ((ResultValue)functRetVal).ExpValue
            };
        }

        public override ResultValue VisitVariableD([NotNull] BatshGParser.VariableDContext context)
        {

            object variable;
            string variableName = context.GetChild(0).ToString();
            string typename = "";

            Dictionary<string, object> variables = null;
            if (actualFunctions.Count > 0)
            {
                Dictionary<string, object> localVariables = 
                    actualFunctions.Peek().LocalVariables;
                if (localVariables.ContainsKey(variableName))
                {
                    variables = localVariables;
                }
            }
            else
            {
                variables = globalVariables;
            }

            if (variables.ContainsKey(variableName))
            {
                variable = variables[variableName];

                typename = charpTypesToBatshTypes[variable.GetType()];
            }
            else
            {

                Type parentContextType = contextTypes[context.parent.GetType()];
                typename = charpTypesToBatshTypes[parentContextType];
                variable = new object();

                if (typename.Equals("string"))
                {
                    variable = string.Empty;
                }
                else
                {
                    variable = 0d;
                }
            }

            return new ResultValue()
            {
                ExpType = typename,
                ExpValue = variable
            };
        }           
        public override ResultValue VisitAssignment([NotNull] BatshGParser.AssignmentContext context)
        {
            string varname = context.varName.Text;
            ResultValue varAsResultValue = Visit(context.right);
            Dictionary<string, object> localVariables = null;

            if (this.actualFunctions.Count > 0)
            {
                localVariables = 
                    actualFunctions.Peek().LocalVariables;
                if (localVariables.ContainsKey(varname))
                {
                    localVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                if (globalVariables.ContainsKey(varname))
                {
                    globalVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                {
                    localVariables.Add(varname, varAsResultValue.ExpValue);
                }
            }
            else
            {
                if (globalVariables.ContainsKey(varname))
                {
                    globalVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                {
                    globalVariables.Add(varname, varAsResultValue.ExpValue);
                }
            }
            return varAsResultValue;
        }
}

可能导致问题的原因是什么?谢谢!

What could cause the problem? Thank you!

推荐答案

为什么 ANTLR 生成的解析器会重用上下文对象?

Why does parser generated by ANTLR reuse context objects?

它没有.源代码中的每个函数调用都对应一个 FunctionCallContext 对象,并且这些对象是唯一的.即使对于两个完全相同的函数调用,它们也必须如此,因为它们还包含元数据,例如函数调用出现在源中的哪个位置 - 即使其他一切都相同,调用之间显然也会有所不同.

It doesn't. Each function call in your source code will correspond to exactly one FunctionCallContext object and those will be unique. They'd have to be, even for two entirely identical function calls, because they also contain meta data, such as where in the source the function call appears - and that's obviously going to differ between calls even if everything else is the same.

为了说明这一点,请考虑以下源代码:

To illustrate this, consider the following source code:

function f(x) {
  return f(x);
}
print(f(x));

这将创建一个包含两个 FunctionCallContext 对象的树——一个用于第 2 行,一个用于第 4 行.它们都是不同的——它们都有引用函数名称的子节点 f 和参数 x,但它们将具有不同的位置信息和不同的哈希码 - 子节点也是如此.这里没有重用任何东西.

This will create a tree containing exactly two FunctionCallContext objects - one for line 2 and one for line 4. They will both be distinct - they'll both have child nodes referring to the function name f and the argument x, but they'll have different location information and a different hash code - as will the child nodes. Nothing is being reused here.

可能导致问题的原因是什么?

What could cause the problem?

您多次看到同一个节点的事实仅仅是因为您多次访问树的同一部分.对于您的用例,这是完全正常的事情,但在您的情况下,它会导致问题,因为您将可变数据存储在对象中,假设您每次都会获得一个新的 FunctionCall 对象函数调用发生在运行时 - 而不是每次函数调用出现在源代码中.

The fact that you're seeing the same node multiple times is simply due to the fact that you're visiting the same part of the tree multiple times. That's a perfectly normal thing to do for your use case, but in your case it causes a problem because you stored mutable data in the object, assuming that you'd get a fresh FunctionCall object for each time a function call happens at run time - rather than each time a function call appears in the source code.

这不是解析树的工作方式(它们表示源代码的结构,而不是运行时可能发生的调用序列),因此您不能使用 FunctionCallContext 对象来存储信息关于特定的运行时函数调用.一般来说,我认为将可变状态放入上下文对象中是个坏主意.

That's not how parse trees work (they represent the structure of the source code, not the sequence of calls that might happen at run time), so you can't use FunctionCallContext objects to store information about a specific run-time function call. In general, I'd consider it a bad idea to put mutable state into context objects.

相反,您应该将可变状态放入访问者对象中.对于您的特定问题,这意味着调用堆栈包含每个运行时函数调用的局部变量.每次函数开始执行时,您可以将帧压入堆栈,每次函数退出时,您都可以弹出它.这样堆栈的顶部将始终包含当前正在执行的函数的局部变量.

Instead you should put your mutable state into your visitor object. For your specific problem that means having a call stack containing the local variables of each run-time function call. Each time a function starts execution, you can push a frame onto the stack and each time a function exits, you can pop it. That way the top of the stack will always contain the local variables of the function currently being executed.

PS:这与您的问题无关,但算术表达式中通常的优先级规则是,+- 具有相同的优先级*/ 具有相同的优先级.在你的语法中,/ 的优先级高于 *,而 - 的优先级高于 +.这意味着例如 9 * 5/3 将计算为 5,而它应该是 15(假设整数的通常规则算术).

PS: This is unrelated to your problem, but the usual rules of precedence in arithmetic expressions are such that, + has the same precedence as - and * has the same precedence as /. In your grammar the precedence of / is greater than that of * and that of - higher than +. This means that for example 9 * 5 / 3 is going to evaluate to 5, when it should be 15 (assuming the usual rules for integer arithmetic).

要修复这个 +-,以及 */ 应该是相同的一部分规则,因此它们具有相同的优先级:

To fix this + and -, as well as * and / should be part of the same rule, so they get the same precedence:

| left=numericexpression op=('*'|'/') right=numericexpression #MulOrDiv
| left=numericexpression op=('+'|'-') right=numericexpression #PlusOrMinus

这篇关于为什么 ANTLR 生成的解析器会重用上下文对象?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆