What values are stored in the symbol table in compiler construction?

Question

In the compiler construction text by Aho, Ullman and Sethi, it is stated that the scanner (lexical analyzer) reads the input string of source characters and groups the characters into meaningful sequences called lexemes, and that for each lexeme the scanner outputs a token of the form <token-name, attribute-value>.

  e.g. position = initial + rate * 60

These characters are grouped into lexemes and mapped into tokens like:


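  1. position is a lexeme that is mapped into the token <id, 1>, where id is an abstract symbol standing for identifier and 1 points to the symbol-table entry for position.

  2. initial is a lexeme that is mapped into the token <id, 2>, where 2 points to the symbol-table entry for initial.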

My question is: how are these tokens stored in the symbol table? We are only mapping lexemes to tokens like <id, 1>, <id, 2>, etc., so where in the symbol table are the values corresponding to these tokens stored? I am familiar with symbol tables in general, but can somebody tell me the signature of the ST used here? Is it something like <id, map<token-name, attribute-value>>? Also, for all the id fields (identifiers), which data structure is used to store the information related to an identifier, such as its name, scope, size and data type?

And at which stage is the ST generated? All stages of the compiler (scanner, parser, semantic analyzer, etc.) use the ST for reference.

Another question: when the parser asks for the next input token, does the scanner read that token from the ST or from the input data? Or does the attribute-value simply contain a pointer into the symbol table? Please help me understand.

Recommended answer

During the lexical scan, the only information you have about a symbol is its spelling. So you can't do much more than intern the symbol to avoid multiple dynamic allocation of the symbol's name. (How useful this is depends a lot on your implementation language.)
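As a rough illustration of interning during the scan, here is a minimal Python sketch. The names `InternTable`, `scan` and `TOKEN_RE` are hypothetical, the token grammar covers only what the example statement from the question needs, and the 1-based indices are chosen to mirror the `<id, 1>` notation. At this point the only thing recorded per entry is the spelling; the scanner reads from the input text and merely writes into the table.

```python
import re

class InternTable:
    """Hypothetical intern table: one entry per distinct identifier spelling."""

    def __init__(self):
        self._index = {}    # spelling -> 1-based entry number
        self.entries = []   # entry number - 1 -> record (only the name for now)

    def intern(self, spelling):
        idx = self._index.get(spelling)
        if idx is None:
            # First time this spelling is seen: create an entry for it.
            self.entries.append({"name": spelling})
            idx = len(self.entries)
            self._index[spelling] = idx
        return idx

# Assumed toy grammar: identifiers, integer literals, and the operators = + *
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+*]))")

def scan(source, table):
    """Read tokens from the input text; identifiers carry a table index."""
    tokens = []
    pos = 0
    end = len(source.rstrip())
    while pos < end:
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup == "id":
            tokens.append(("id", table.intern(m.group("id"))))
        elif m.lastgroup == "number":
            tokens.append(("number", int(m.group("number"))))
        else:
            tokens.append((m.group("op"), None))
    return tokens

table = InternTable()
tokens = scan("position = initial + rate * 60", table)
```

Here `tokens` starts with `("id", 1)` for position, and scanning `rate` again later returns the same index rather than allocating a second copy of the name.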

As the analysis continues, you will accumulate more information about each symbol. In most programming languages, the same name will be associated with multiple objects: some of the associations will be scoped (local variables) while others will be contextual (namespaces and aggregate members, for example). The precise meaning of each lexeme will need to be resolved, but that might not happen even during the initial syntactic parse. (For example, the name of a structure member will need to be associated with the actual member in the object which describes the structure's type, but until you've resolved the type of each expression, you won't know what the structure type is.)
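To make the distinction between scoped and contextual associations concrete, the following Python sketch shows one possible arrangement; it is not the book's design. `Symbol` and `ScopedTable` are hypothetical names, and the fields (`kind`, `type_name`, `members`) are assumptions about what later phases might record. Local names are resolved through a stack of scopes, while a member name can only be resolved once the type of the expression on its left is known.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Symbol:
    name: str
    kind: str                          # e.g. "variable", "struct", "member"
    type_name: Optional[str] = None    # filled in once the type is known
    members: Dict[str, "Symbol"] = field(default_factory=dict)  # structs only

class ScopedTable:
    """Hypothetical scoped table: a stack of name -> Symbol dictionaries."""

    def __init__(self):
        self.scopes = [{}]             # scopes[0] is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def leave_scope(self):
        self.scopes.pop()

    def declare(self, symbol):
        self.scopes[-1][symbol.name] = symbol

    def lookup(self, name):
        # Innermost scope wins; fall back outward to the global scope.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = ScopedTable()

# The name "x" ends up associated with three different objects.
point = Symbol("point", "struct")
point.members["x"] = Symbol("x", "member", "int")   # contextual: member of point
table.declare(point)
table.declare(Symbol("x", "variable", "float"))     # scoped: global x
table.declare(Symbol("p", "variable", "point"))

table.enter_scope()
table.declare(Symbol("x", "variable", "int"))       # scoped: local x shadows global
inner_x = table.lookup("x")

# Resolving the member in "p.x" needs the type of p first.
p = table.lookup("p")
member_x = table.lookup(p.type_name).members["x"]
table.leave_scope()

outer_x = table.lookup("x")
```

Note that the member table lives inside the struct's own `Symbol` rather than in the scope stack, which is one example of why a compiler tends to end up with several name-to-information containers rather than a single symbol table.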

So there is no one answer to this question. There will likely be a lot of different containers in your compiler which associate a name with some collection of information, and they are not likely to all have the same data fields. All that will have to be fleshed out as you write the various phases of your compiler.
