What exactly is a token, in relation to parsing?

Question

I have to use a parser and a writer in C++. I am trying to implement their functions, but I do not understand what a token is. One of my functions has to check whether there are more tokens to produce:

bool Parser::hasMoreTokens()

How exactly do I go about this? Please help.

I am opening a text file that contains text; all of the words are lowercase. How do I check whether it has more tokens?

This is what I have:

bool Parser::hasMoreTokens() {
    while (source.peek() != NULL) {
        return true;
    }
    return false;
}

Recommended answer

Tokens are the output of lexical analysis and the input to parsing. Typically they are things like:

  • numbers
  • variable names
  • parentheses
  • arithmetic operators
  • statement terminators

That is, roughly, the biggest things that can be unambiguously identified by code that looks at its input just one character at a time.

One note, which you should feel free to ignore if it confuses you: the boundary between lexical analysis and parsing is a little fuzzy. For instance:

  1. Some programming languages have complex-number literals that look like, say, 2+3i or 3.2e8-17e6i. If you were parsing such a language, you could make the lexer gobble up a whole complex number and turn it into a single token. Alternatively, you could have a simpler lexer and a more complicated parser, and make (say) 3.2e8, -, and 17e6i separate tokens; it would then be the parser's job (or even the code generator's) to notice that what it has is really a single literal.

  2. In some programming languages, the lexer may not be able to tell whether a given token is a variable name or a type name. (This happens in C, for instance.) But the grammar of the language may distinguish between the two, so that you'd like "variable foo" and "type name foo" to be different tokens. (This also happens in C.) In this case, it may be necessary for some information to be fed back from the parser to the lexer so that it can produce the right sort of token in each case.
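The feedback described in the second point can be sketched as a lexer that keeps a set of known type names, which the parser fills in as it sees declarations. This is only an illustration under assumptions: `Lexer`, `Token`, `TokenKind`, and `declareTypeName` are hypothetical names, and the lexer handles just identifiers, integers, and single-character symbols.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

enum class TokenKind { Identifier, TypeName, Number, Symbol, End };

struct Token {
    TokenKind kind;
    std::string text;
};

// Hypothetical lexer whose classification of names depends on
// information supplied by the parser.
class Lexer {
public:
    explicit Lexer(std::string input) : text(std::move(input)) {}

    // The parser calls this when it recognizes a type declaration.
    void declareTypeName(const std::string& name) { typeNames.insert(name); }

    Token nextToken() {
        while (pos < text.size() && std::isspace(byteAt(pos))) ++pos;
        if (pos >= text.size()) return {TokenKind::End, ""};

        const std::size_t start = pos;
        if (std::isalpha(byteAt(pos)) || text[pos] == '_') {
            while (pos < text.size() &&
                   (std::isalnum(byteAt(pos)) || text[pos] == '_')) ++pos;
            std::string word = text.substr(start, pos - start);
            // Same spelling, different token kind, depending on parser feedback.
            const TokenKind kind = typeNames.count(word) ? TokenKind::TypeName
                                                         : TokenKind::Identifier;
            return {kind, word};
        }
        if (std::isdigit(byteAt(pos))) {
            while (pos < text.size() && std::isdigit(byteAt(pos))) ++pos;
            return {TokenKind::Number, text.substr(start, pos - start)};
        }
        ++pos;  // anything else is treated as a one-character symbol
        return {TokenKind::Symbol, text.substr(start, 1)};
    }

private:
    unsigned char byteAt(std::size_t i) const {
        return static_cast<unsigned char>(text[i]);
    }

    std::string text;
    std::size_t pos = 0;
    std::set<std::string> typeNames;
};
```

With this arrangement, `foo` is reported as an `Identifier` until the parser calls `declareTypeName("foo")`, and as a `TypeName` afterwards. That is what lets a C-like parser decide whether `foo * bar;` is a multiplication or a pointer declaration.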

So "what exactly is a token?" may not always have a perfectly well-defined answer.
