lex and yacc (symbol table generation)

Question

I am new to lex, yacc and compiler design. I would like to know in which phase (lexical, syntactic or any other) the symbol table is generated, and how.

Can I have a brief description of the y.output file, which is generated by giving the -v option to yacc? I tried looking into it but didn't get much information.

I would also like to know what other applications lex and yacc are used for, apart from compiler design.

Solution

A symbol table is a global data structure that can be used in all stages/phases/passes of a compiler. This means that it can be accessed from both the lex-generated and the yacc-generated components.

Conventionally, the lexical analyser accesses the symbol table when it finds a token that belongs in the table, such as an identifier. It can find the entry (or create it), update it with information only the lexer has, such as line number and character position, and store the lexeme value if it is not already there. A pointer to the symbol table entry can then be returned in the lval of the token (yylval in lex and yacc).
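As a rough illustration of that lexer-side access, here is a minimal sketch of a find-or-create routine in C. Everything in it is an assumption made for the example, not something lex or yacc provides: the `Symbol` layout, the `sym_intern` name, the chained hash table and its size are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 211          /* hypothetical bucket count */

typedef struct Symbol {
    char *name;                 /* copy of the lexeme, owned by the table */
    int first_line;             /* line where the lexer first saw it */
    int last_line;              /* most recent line where it appeared */
    struct Symbol *next;        /* chain for hash collisions */
} Symbol;

static Symbol *table[TABLE_SIZE];

static unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Find the entry for `name`, creating it on first sight.
   Returns NULL only if memory allocation fails. */
Symbol *sym_intern(const char *name, int line) {
    assert(name != NULL);
    unsigned h = hash(name);
    for (Symbol *s = table[h]; s != NULL; s = s->next) {
        if (strcmp(s->name, name) == 0) {
            s->last_line = line;    /* lexer-only information */
            return s;
        }
    }
    Symbol *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) {
        free(s);
        return NULL;
    }
    strcpy(s->name, name);
    s->first_line = s->last_line = line;
    s->next = table[h];             /* push onto the bucket's chain */
    table[h] = s;
    return s;
}
```

In a lex rule for identifiers, the action might then look something like `yylval.sym = sym_intern(yytext, yylineno); return IDENT;`, assuming a `%union` with a `sym` member and a token named `IDENT`, both of which you would have to declare yourself.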

Some people prefer to return a pointer to the lexeme itself (as the lval) from the lexer to the parser and do the initial symbol table access there. This has an advantage that the symbol table does not have to be visible to the lexer, but has the disadvantage that lexer information as described above may no longer be available to store with the symbol. It often has the disadvantage of making the parser actions from yacc a little more "busy" as they then may be involved in managing the symbol table as well as the parse tree.

The symbol table entry will be further updated in later phases of the compiler, such as a semantic walk of the parse tree, which can annotate the symbol entries with type information, flag undeclared objects and the like. The symbol table will be used again during target code generation, when target-specific information may be stored or needed, and again during optimisation, when variable usage may be examined or variables even optimised away.
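To make the later-phase updates a little more concrete, here is a small sketch of what a semantic pass might do to entries that already exist. It is deliberately simplified and entirely hypothetical: a fixed-size array stands in for the table, names are assumed to outlive the table, and `sem_declare` / `sem_use` are invented names for the two events a tree walk would report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TYPE_UNKNOWN, TYPE_INT, TYPE_FLOAT } Type;

typedef struct {
    const char *name;   /* assumed to outlive the table (not copied) */
    Type type;          /* filled in by the semantic pass */
    int declared;       /* set when a declaration is seen */
    int use_count;      /* could later guide optimisation */
} Symbol;

#define MAX_SYMS 64     /* hypothetical fixed capacity */
static Symbol syms[MAX_SYMS];
static int nsyms;

/* Find-or-create by linear search; returns NULL when the table is full. */
Symbol *sym_lookup(const char *name) {
    assert(name != NULL);
    for (int i = 0; i < nsyms; i++)
        if (strcmp(syms[i].name, name) == 0)
            return &syms[i];
    if (nsyms == MAX_SYMS)
        return NULL;
    Symbol *s = &syms[nsyms++];
    s->name = name;
    s->type = TYPE_UNKNOWN;
    s->declared = 0;
    s->use_count = 0;
    return s;
}

/* Called when the tree walk meets a declaration of `s`. */
void sem_declare(Symbol *s, Type t) {
    assert(s != NULL);
    s->declared = 1;
    s->type = t;
}

/* Called when the tree walk meets a use of `s`.
   Returns 1 if the symbol was declared, 0 if it should be flagged. */
int sem_use(Symbol *s) {
    assert(s != NULL);
    s->use_count++;
    return s->declared;
}
```

A real compiler would also have to deal with scopes and redeclarations, which this sketch ignores.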

The symbol table is a data structure that you, the compiler writer, create for yourself. There is no feature of lex or yacc that does it for you. It is generated as and when the code you write creates it!

The y.output file has nothing to do with symbol tables. It is a record of how yacc converted the context-free grammar into a parse table: it lists the parser states, with the grammar items and actions in each. It is useful when you are debugging an ambiguous grammar and want to know which rules are causing the shift/reduce or reduce/reduce conflicts.

As for the last part of the question, what uses do these tools have? lex is a tool that generates code for a state machine that recognises the patterns you specify. It does not have to be used in writing compilers. One interesting use is in handling networking protocols that can be processed by a state machine, such as TCP/IP datagrams and so forth. Similarly, yacc is used for matching sequences that are described by context-free grammars. These do not have to be programs; they could be other complex sequences of symbols, fields or data items. Normally they are just pieces of text, and that is the orthodox use of the tool.
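To give a feel for what "a state machine that recognises a pattern" means, here is a hand-written sketch in C for the pattern `[0-9]+(\.[0-9]+)?` (an unsigned decimal number). This is not lex output: lex generates a table-driven recogniser from the pattern, whereas the states and the `matches_number` function below are written by hand purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One state per position in the pattern [0-9]+(\.[0-9]+)? */
typedef enum { S_START, S_INT, S_DOT, S_FRAC, S_REJECT } State;

/* Transition function: next state for the current state and input character. */
static State step(State st, char c) {
    int digit = isdigit((unsigned char)c);
    switch (st) {
    case S_START: return digit ? S_INT : S_REJECT;
    case S_INT:   return digit ? S_INT : (c == '.' ? S_DOT : S_REJECT);
    case S_DOT:   return digit ? S_FRAC : S_REJECT;
    case S_FRAC:  return digit ? S_FRAC : S_REJECT;
    default:      return S_REJECT;
    }
}

/* Returns 1 if the whole string matches the pattern, 0 otherwise. */
int matches_number(const char *s) {
    assert(s != NULL);
    State st = S_START;
    for (; *s != '\0' && st != S_REJECT; s++)
        st = step(st, *s);
    /* Only S_INT and S_FRAC are accepting states. */
    return st == S_INT || st == S_FRAC;
}
```

The same shape (a current state, a transition per input symbol, a set of accepting states) applies whether the input is source text or the fields of a protocol message.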

These parts of your question really sound like the kind of exam question that someone might write for students who have attended a course in compilers!
