递归下降解析器实现 [英] Recursive descent parser implementation

查看:93
本文介绍了递归下降解析器实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找写递归下降解析器的伪代码.现在,我没有这种编码的经验.我已经在线阅读了一些示例,但它们仅适用于使用数学表达式的语法.这是我基于解析器的语法.

I am looking to write some pseudo-code of a recursive descent parser. Now, I have no experience with this type of coding. I have read some examples online but they only work on grammar that uses mathematical expressions. Here is the grammar I am basing the parser on.

S -> if E then S | if E then S else S | begin S L | print E

L -> end | ; S L

E -> i

我必须编写方法S()L()E()并返回一些错误消息,但是我在网上找到的教程并没有太大帮助.谁能指出我正确的方向并举一些例子?.

I must write methods S(), L() and E() and return some error messages, but the tutorials I have found online have not helped a lot. Can anyone point me in the right direction and give me some examples?.

我想用C#或Java语法编写它,因为它使我更容易建立联系.

I would like to write it in C# or Java syntax, since it is easier for me to relate.

public void S() {
    if (currentToken == "if") {
        getNextToken();
        E();

        if (currentToken == "then") {
            getNextToken();
            S();

            if (currentToken == "else") {
                getNextToken();
                S();
                Return;
            }
        } else {
            throw new IllegalTokenException("Procedure S() expected a 'then' token " + "but received: " + currentToken);
        } else if (currentToken == "begin") {
            getNextToken();
            S();
            L();
            return;
        } else if (currentToken == "print") {
            getNextToken();
            E();
            return;
        } else {
            throw new IllegalTokenException("Procedure S() expected an 'if' or 'then' or else or begin or print  token " + "but received: " + currentToken);
        }
    }
}


public void L() {
    if (currentToken == "end") {
        getNextToken();
        return;
    } else if (currentToken == ";") {
        getNextToken();
        S();
        L();
        return;
    } else {
        throw new IllegalTokenException("Procedure L() expected an 'end' or ';' token " + "but received: " + currentToken);
    }
}


public void E() {
    if (currentToken == "i") {
        getNextToken();
        return;
    } else {
        throw new IllegalTokenException("Procedure E() expected an 'i' token " + "but received: " + currentToken);
    }
}

推荐答案

基本上以递归下降的方式将语法中的每个非终结符解析为一个过程,然后在每个过程中检查您要查找的当前标记at匹配您希望在与该过程相对应的非终结符右侧看到的内容,如果确实如此,则继续应用生产;如果没有,那么您遇到错误,必须采取一些措施.

Basically in recursive descent parsing each non-terminal in the grammar is translated into a procedure, then inside each procedure you check to see if the current token you are looking at matches what you would expect to see on the right hand side of the non-terminal symbol corresponding to the procedure, if it does then continue applying the production, if it doesn't then you have encountered an error and must take some action.

因此,在您如上所述的情况下,您将具有以下步骤:S()L()E(),我将举一个示例说明如何实现L(),然后您可以尝试做S()E()自己.

So in your case as you mentioned above you will have procedures: S(), L(), and E(), I will give an example of how L() could be implemented and then you can try and do S() and E() on your own.

请务必注意,您将需要其他程序来为输入标记化标记,然后您可以向该程序询问输入中的下一个标记.

It is also important to note that you will need some other program to tokenize the input for you, then you can just ask that program for the next token from your input.

/**
 * This procedure corresponds to the L non-terminal
 * L -> 'end'
 * L -> ';' S L
 */ 
public void L()
{
   if(currentToken == 'end')
   {
      //we found an 'end' token, get the next token in the input stream
      //Notice, there are no other non-terminals or terminals after 
      //'end' in this production so all we do now is return
      //Note: that here we return to the procedure that called L()
      getNextToken();
      return; 
   } 
   else if(currentToken == ';')
   {
      //we found an ';', get the next token in the input stream
      getNextToken();
      //Notice that there are two more non-terminal symbols after ';'
      //in this production, S and L; all we have to do is call those
      //procedures in the order they appear in the production, then return
      S();
      L();
      return;
   }
   else
   {
      //The currentToken is not either an 'end' token or a ';' token 
      //This is an error, raise some exception
      throw new IllegalTokenException(
          "Procedure L() expected an 'end' or ';' token "+
          "but received: " + currentToken);
   }
}

现在您尝试S()E(),然后回发.

Now you try S() and E(), and post back.

正如克里斯托弗(Kristopher)所指出的那样,您的语法还有另外一种悬空的含义,这意味着您所产生的事物以相同的事物开始,直至某一点:

Also as Kristopher points out your grammar has something called a dangling else, meaing that you have a production that starts with the same thing up to a point:

S -> if E then S 
S -> if E then S else S

因此,这就引出了一个问题,如果您的解析器看到一个'if'令牌,它应该选择哪个生产来处理输入?答案是不知道该选择哪一个,因为与人类不同,编译器无法向前看输入流以搜索其他"令牌.这是一个简单的问题,可以通过应用称为左因子"的规则来解决,该规则与您考虑代数问题的方式非常相似.

So this begs the question if your parser sees an 'if' token, which production should it choose to process the input? The answer is it has no idea which one to choose because unlike humans the compiler can't look ahead into the input stream to search for an 'else' token. This is a simple problem to fix by applying a rule known as Left-Factoring, very similar to how you would factor an algebra problem.

您要做的就是创建一个新的非终端符号S'(S-素数),其右手侧将容纳不常见的作品,因此您的S作品no变为:

All you have to do is create a new non-terminal symbol S' (S-prime) whose right hand side will hold the pieces of the productions that aren't common, so your S productions no becomes:

S  -> if E then S S'
S' -> else S 
S' -> e   
(e is used here to denote the empty string, which basically means there is no   
 input seen)

这篇关于递归下降解析器实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆