REPL for interpreter using Flex/Bison


Problem description


I've written an interpreter for a C-like language, using Flex and Bison for the scanner/parser. It's working fine when executing full program files.

Now I'm trying to implement a REPL in the interpreter for interactive use. I want it to work like the command-line interpreters in Ruby or ML:

  1. Show a prompt
  2. Accept one or more statements on the line
  3. If the expression is incomplete

    1. display a continuation prompt
    2. allow the user to continue entering lines

  4. When the line ends with a complete expression

    1. echo the result of evaluating the last expression
    2. show the main prompt

My grammar starts with a top_level production, which represents a single statement in the language. The lexer is configured for interactive mode on stdin. I am using the same scanner and grammar in both full-file and REPL modes, because there's no semantic difference in the two interfaces.

My main evaluation loop is structured like this.

while (!interpreter.done) {
    if (interpreter.repl)
        fputs(prompt, stdout); // not printf(prompt): the prompt is not a format string
    int status = yyparse(interpreter);
    if (status) {
        if (interpreter.error)
            report_error(interpreter);
    }
    else {
        if (interpreter.repl)
            puts(interpreter.result);
    }
}

This works fine except for the prompt and echo logic. If the user enters multiple statements on a line, this loop prints out superfluous prompts and expressions. And if the expression continues on multiple lines, this code doesn't print out continuation prompts. These problems occur because the granularity of the prompt/echo logic is a top_level statement in the grammar, but the line-reading logic is deep in the lexer.

What's the best way to restructure the evaluation loop to handle the REPL prompting and echoing? That is:

  • how can I display one prompt per line
  • how can I display the continuation prompt at the right time
  • how can I tell when a complete expression is the last one on a line

(I'd rather not change the scanner language to pass newline tokens, since that will severely alter the grammar. Modifying YY_INPUT and adding a few actions to the Bison grammar would be fine. Also, I'm using the stock Flex 2.5.35 and Bison 2.3 that ship with Xcode.)

Solution

After looking at how languages like Python and SML/NJ handle their REPLs, I got a nice one working in my interpreter. Instead of having the prompt/echo logic in the outermost parser driver loop, I put it in the innermost lexer input routine. Actions in the parser and lexer set flags that control the prompting done by the input routine.

I'm using a reentrant scanner, so yyextra contains the state passed between the layers of the interpreter. It looks roughly like this:

typedef struct Interpreter {
    char* ps1; // prompt to start statement
    char* ps2; // prompt to continue statement
    char* echo; // result of last statement to display
    BOOL eof; // set by the EOF action in the parser
    char* error; // set by the error action in the parser
    BOOL completeLine; // managed by yyread
    BOOL atStart; // true before scanner sees printable chars on line
    // ... and various other fields needed by the interpreter
} Interpreter;
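
The answer does not show how these flags are initialized. As a minimal sketch, assuming a trimmed-down struct (`ReplState`) and a helper (`repl_state_init`) that are both hypothetical names, and using C99 `bool` in place of `BOOL`/`YES`/`NO`, the start-up state could look like this. The one detail that matters is that `completeLine` probably has to start out true, otherwise the first call to `yyread` would not print any prompt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down version of the Interpreter struct above.
   Only the REPL-related fields are kept. */
typedef struct ReplState {
    const char *ps1;    /* prompt to start a statement */
    const char *ps2;    /* prompt to continue a statement */
    char *echo;         /* result of the last statement, or NULL */
    char *error;        /* set by the parser's error action */
    bool eof;           /* set by the parser's EOF action */
    bool completeLine;  /* managed by the input routine */
    bool atStart;       /* true until a printable token is scanned */
} ReplState;

/* Hypothetical helper: put the REPL flags in their start-up state. */
void repl_state_init(ReplState *st, const char *ps1, const char *ps2)
{
    assert(st != NULL);
    st->ps1 = ps1;
    st->ps2 = ps2;
    st->echo = NULL;
    st->error = NULL;
    st->eof = false;
    st->completeLine = true;  /* assumption: lets the first read show a prompt */
    st->atStart = true;       /* nothing has been scanned yet */
}
```

Calling `repl_state_init(&st, "> ", ". ")` before the first `yyparse` would then make the first prompt the main one.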

The lexer input routine:

size_t yyread(FILE* file, char* buf, size_t max, Interpreter* interpreter)
{
    // Interactive input is signaled by yyin==NULL.
    if (file == NULL) {
        if (interpreter->completeLine) {
            if (interpreter->atStart && interpreter->echo != NULL) {
                fputs(interpreter->echo, stdout);
                fputs("\n", stdout);
                free(interpreter->echo);
                interpreter->echo = NULL;
            }
            fputs(interpreter->atStart ? interpreter->ps1 : interpreter->ps2, stdout);
            fflush(stdout);
        }

        char ibuf[max+1]; // fgets needs an extra byte for \0
        size_t len = 0;
        if (fgets(ibuf, max+1, stdin)) {
            len = strlen(ibuf);
            memcpy(buf, ibuf, len);
            // Show the prompt next time if we've read a full line.
            interpreter->completeLine = (ibuf[len-1] == '\n');
        }
        else if (ferror(stdin)) {
            // TODO: propagate error value
        }
        return len;
    }
    else { // not interactive
        size_t len = fread(buf, 1, max, file);
        if (len == 0 && ferror(file)) {
            // TODO: propagate error value
        }
        return len;
    }
}

The top level interpreter loop becomes:

while (!interpreter->eof) {
    interpreter->atStart = YES;
    int status = yyparse(interpreter);
    if (status) {
        if (interpreter->error)
            report_error(interpreter);
    }
    else {
        exec_statement(interpreter);
        if (interactive)
            interpreter->echo = result_string(interpreter);
    }
}

The Flex file gets these new definitions:

%option extra-type="Interpreter*"

#define YY_INPUT(buf, result, max_size) result = yyread(yyin, buf, max_size, yyextra)

#define YY_USER_ACTION  if (!isspace((unsigned char)*yytext)) { yyextra->atStart = NO; }

The YY_USER_ACTION handles the tricky interplay between tokens in the language grammar and lines of input. My language is like C and ML in that a special character (';') is required to end a statement. In the input stream, that character can either be followed by a newline character to signal end-of-line, or it can be followed by characters that are part of a new statement. The input routine needs to show the main prompt if the only characters scanned since the last end-of-statement are newlines or other whitespace; otherwise it should show the continuation prompt.
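
The prompt rule described above can be tried out without Flex or Bison. The following is a toy model, not the real scanner: `scan_line` and `next_prompt` are hypothetical names, a `';'` stands in for "the parser finished a statement and the driver loop reset `atStart`", and string literals or comments containing `';'` are ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one input line passing through the scanner.
   Takes the atStart flag as it was before the line and returns
   its value after the whole line has been consumed. */
bool scan_line(const char *line, bool atStart)
{
    assert(line != NULL);
    for (const char *p = line; *p != '\0'; ++p) {
        if (*p == ';') {
            /* Statement complete: the driver loop sets atStart again. */
            atStart = true;
        } else if (!isspace((unsigned char)*p)) {
            /* What YY_USER_ACTION does for any non-whitespace token. */
            atStart = false;
        }
    }
    return atStart;
}

/* The choice the input routine makes before reading the next line. */
const char *next_prompt(bool atStart, const char *ps1, const char *ps2)
{
    return atStart ? ps1 : ps2;
}
```

In this model `scan_line("x = 1;\n", true)` leaves the flag set, so the main prompt follows, while `scan_line("x = 1; y =\n", true)` clears it, so the continuation prompt follows.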
