Java中的递归下降解析器 [英] Recursive Descent Parser in Java

查看:191
本文介绍了Java中的递归下降解析器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想通过说这是我第三年编程语言课的家庭作业作为序言,并且我正在寻求一些帮助.我的作业内容为:

I would like to preface this by saying this is a homework assignment for my third Year Programming Languages Class, and I'm looking for some help with it. My assignment reads:

截止日期:2013年2月22日,晚上11:55
提交:请将以下内容上传到CMS.

1.源代码
2.程序执行的屏幕截图,包括您使用的输入文件

Deadline: February 22, 2013 at 11:55pm
Submission: Please upload the following to the CMS.

1. Source code
2. A screen shot of the execution of your program including the input file you used

使用您喜欢的任何编程语言编写递归下降解析器,以解析由以下EBNF描述生成的语言.您的解析器应检测输入程序是否存在语法错误.它不必指定错误的位置和原因.

Use any programming language you prefer to write a recursive-descent parser that parses the language generated by the following EBNF descriptions. Your parser should detect whether or not the input program has any syntax errors. It does not have to specify what and where the error is.

<program>   begin <stmt_list> end
<stmt_list>  <stmt> {;<stmt_list>}
<stmt>    <assign_stmt> | <while_stmt>
<assign_stmt>  <var> = <expr>
<var>  identifier  (An identifier is a string that begins with a letter followed by 0     or more letters and digits)
<expr>  <var> { (+|-) <var>}           
<while_stmt>   while (<logic_expr>)  <stmt>
<logic_expr> ® <var> (< | >) <var>  (Assume that logic expressions have only less than     or greater than operators)

看起来很有趣的符号只是指向右边的箭头.

The symbols that look funny are just arrows pointing to the right.

目前,我的问题比编程更合乎逻辑:在我的第一次尝试中,我读取了整个输入程序,将其保存为一个字符串,然后解析该字符串并将每个符号转换为一个终端,expr,或你有什么.

My problem at the moment is more logical then it is programming: in my first attempt, I read the entire input program in, saved it to a string, then parsed that string and converted every symbol to either a terminal, expr, or what have you.

我最终发现这种方式行不通,因为,A:我不认为这是RDP,B:许多非终端都是由1条以上的语句组成的.

I eventually found that this way would not work because, A: I don't think that it is RDP, B: many of the non terminals are made of more then 1 statement.

我放弃了这种方法,并决定在浪费更多时间进行编程之前,我会伪装所有内容.我的新想法是为每个非终端符号制作1个方法,然后逐个符号解析输入的字符串符号,以在这些方法之间切换.这种方法似乎是合乎逻辑的,但是当我开始编写伪代码时,我对要做的事情感到非常迷茫和困惑. 我将如何完成此代码?

I gave up on that approach, and decided that before I waste more time programming, I would Pseudo everything out. My new idea was to make 1 method, for each non terminal symbol, and just parse the input string symbol by symbol, hoping between those methods. This approach seemed logical, but as I started writing the pseudocode I got very lost and confused as to what I needed to do. How would I finish this code?

这是RDP的一些伪代码:

Here is some pseudocode for RDP:

intputString;

public void parseProgram (Symbol.typeIsProgram) {
    if getNextSymbol == "begin" {
        if (intputString.substring (inputString.length()-3,
                inputString.length()) == "end") {
            Symbol stmt_lsit = new Symbol (intputString)
            parseStmt_list(stmt_list);              
        } else {
            Out "error, prog must end with end"
        }
    } else {
        Out "error, prog must begin with begin"
    }   
}

public void parseStmt_list (Stmbol.typeIsStmt_list) {
    symbol = getNextSymbol;
    if (Symbol.typeIsVar) {
        parseVar(symbol)
    } else if (Symbol.typeIsWhile)  {
        // weve only capture if the first word return is a while, we dont have the whole while statement yet
        ParseWhile_stmt(symbol)
    } else { }
}

public void parseStmt () { }
public void parseAssign_stmt () { }
public void parseVar () { }
public void parseExpr () { }
public void parseWhile_stmt () { }
public void parseLogic_expr () { }

public Symbol getNextSymbol() {
    //returns the next symbol in input string and removes it from the input string
}

仅供参考,我的解析器将是一个示例输入程序.

Just an FYI a sample input program for my parser would be.

begin 
total = var1 + var2; 
while (var1 < var2) 
while ( var3 > var4)
var2 = var2 - var1 
end

推荐答案

根据此特定任务,可以以功能方式使用字符串处理.

According to this particular assignment it is alright to use string processing in a functional way.

尝试以下算法:

  1. 检查,该行以begin开头并以end
  2. 结尾
  3. 如果是,请使用;拆分剩余的字符串,并对每个部分(语句)执行以下步骤
  4. 检查语句是否包含=while子字符串
  5. 对于分配检查,您可以使用+-分割输入,并为每个部分检查变量条件(字母数字内容)
  6. 一会儿-检查()并递归处理括号之间和之后的两个字符串
  7. 最后,对于用子字符串<>拆分的逻辑表达式,并检查部分是否为变量(字母数字)
  1. check, that line begins with begin and ends with end
  2. if yes - split remaining string with ; and do the following steps for each part (statement)
  3. check if statement consists a = or while substring
  4. for assignment check you can split an input with + or - and for each part check the variable condition (alphanumeric content)
  5. for while - check for ( and ) and recursively process two strings between brackets and after it
  6. finally, for logical expression split by substring < or >, and check that parts are variables (alphanumeric)

也许,这不是一个非常灵活的解决方案,但据我所知,它对于您的任务是可以接受的.

Maybe, it is not very flexible solution, but as I see it is acceptable for your task.

我发现任务很有趣,并试图编写一个完整的解决方案.

I found the task interesting and tried to write a complete solution.

public enum Parser {
PROGRAM {
    void parse(String s) throws ParseException {
        s = s.trim();
        if (s.startsWith("begin") && s.endsWith("end")) {
            STATEMENT_LIST.parse(s.substring(5, s.length() - 3));
        } else {
            throw new ParseException("Illegal begin/end");
        }
    }
},

STATEMENT_LIST {
    void parse(String s) throws ParseException {
        String[] parts = s.trim().split(";");
        for (String part : parts) {
            STATEMENT.parse(part.trim());
        }
    }
},

STATEMENT {
    void parse(String s) throws ParseException {
        if (s.startsWith("while")) {
            WHILE.parse(s.substring(5));
        } else if (s.contains("=")) {
            ASSIGNMENT.parse(s);
        } else {
            throw new ParseException("Illegal statement: " + s);
        }
    }
},

WHILE {
    void parse(String s) throws ParseException {
        int i = s.indexOf("(");
        int j = s.indexOf(")");
        if (i != -1 && j != -1) {
            LOGICAL.parse(s.substring(i + 1, j).trim());
            STATEMENT.parse(s.substring(j + 1).trim());
        } else {
            throw new ParseException("Illegal while: " + s);
        }
    }
},

ASSIGNMENT {
    void parse(String s) throws ParseException {
        int i = s.indexOf("=");
        if (i != -1) {
            VARIABLE.parse(s.substring(0, i).trim());
            EXPRESSION.parse(s.substring(i + 1).trim());
        }
    }
},

EXPRESSION {
    void parse(String s) throws ParseException {
        String[] parts = s.split("\\+|-");
        for (String part : parts) {
            VARIABLE.parse(part.trim());
        }
    }
},

LOGICAL {
    void parse(String s) throws ParseException {
        int i;
        if (s.contains("<")) {
            i = s.indexOf("<");
        } else if (s.contains(">")) {
            i = s.indexOf(">");
        } else {
            throw new ParseException("Illegal logical: " + s);
        }

        VARIABLE.parse(s.substring(0, i).trim());
        VARIABLE.parse(s.substring(i + 1).trim());
    }
},

VARIABLE {
    void parse(String s) throws ParseException {
        if (!s.matches("^[a-zA-Z][a-zA-Z0-9]*$")) {
            throw new ParseException("Illegal variable: " + s);
        }
    }
};

abstract void parse(String s) throws ParseException;

public static void main(String[] args) {
    try {
        PROGRAM.parse("begin \n" +
                "total = var1 + var2; \n" +
                "while (var1 < var2) \n" +
                "while ( var3 > var4)\n" +
                "var2 = var2 - var1 \n" +
                "end");
        System.out.println("OK");
    } catch (ParseException e) {
        System.out.println(e.getMessage());
    }
}
}

class ParseException extends Exception {
public ParseException(String message) {
    super(message);
}
}

这篇关于Java中的递归下降解析器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆