Parser with scopes and conditionals


Question

I'm writing a C/C++/... build system (I understand this is madness ;)), and I'm having trouble designing my parser.

My recipes look like this:

global
{
    SOURCE_DIRS src
    HEADER_DIRS include
    SOURCES bitwise.c \
            framing.c
    HEADERS \
            ogg/os_types.h \
            ogg/ogg.h
}
lib static ogg_static
{
    NAME ogg
}
lib shared ogg_shared
{
    NAME ogg
}

(This is based on the super simple libogg source tree.)

# starts a comment, and \ is a "newline escape", meaning the line continues on the next line (see QMake syntax). {} are scopes, as in C++, and global holds settings that apply to every "target". This is all background, and not that relevant. What I really don't know is how to work with my scopes. I will need to be able to have multiple scopes, and also a form of conditional processing, along the lines of:

win32:DEFINES NO_CRT_SECURE_DEPRECATE

The parsing function will need to know what level of scope it is at, and call itself whenever the scope level increases. There is also the problem of where the braces are placed (global { or global{ or on a line of their own, as in the example).

How could I go about this, using Standard C++ and the STL? I understand this is a whole lot of work, and that's exactly why I need a good starting point. Thanks!

What I already have is the ifstream input and internal string/stringstream storage, so I can read the file word by word.

Recommended answer

I would suggest (and this is more or less straight out of the compiler textbooks) that you approach the problem in phases. This breaks things down so that each phase deals with a much more manageable problem.

Focus first on the lexer phase. Your lexing phase should take the raw text and give you a sequence of tokens, such as words and special characters. The lexer can take care of line continuations, and handle whitespace and comments as appropriate. By handling whitespace, the lexer simplifies your parser's task: you can write the lexer so that global{, global {, and even

global

{

will all yield two tokens: one representing global and one representing {.

Also note that the lexer can attach line and column numbers to the tokens, for use later when you need to report errors.
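A lexer along these lines could be sketched as follows. This is only a minimal illustration under assumptions, not the one true design: the names TokenKind, Token and lex are made up for this example, # is assumed to start a comment that runs to the end of the line, and a Newline token is assumed to be what ends a setting line such as SOURCE_DIRS src.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for the recipe format shown in the question.
enum class TokenKind { Word, LBrace, RBrace, Colon, Newline };

struct Token {
    TokenKind kind;
    std::string text;  // spelling of the token (empty for Newline)
    int line;          // 1-based line where the token starts
    int column;        // 1-based column where the token starts
};

std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    int line = 1, col = 1;
    std::size_t i = 0;

    auto isSpecial = [](char c) { return c == '{' || c == '}' || c == ':'; };
    // True when position p holds a backslash immediately followed by '\n'.
    auto atContinuation = [&](std::size_t p) {
        return src[p] == '\\' && p + 1 < src.size() && src[p + 1] == '\n';
    };

    while (i < src.size()) {
        const char c = src[i];
        if (atContinuation(i)) {
            // Line continuation: swallow both characters, emit nothing.
            i += 2; ++line; col = 1;
        } else if (c == '\n') {
            // Only a newline that ends a run of words is significant.
            if (!out.empty() && out.back().kind == TokenKind::Word)
                out.push_back({TokenKind::Newline, "", line, col});
            ++i; ++line; col = 1;
        } else if (std::isspace(static_cast<unsigned char>(c))) {
            ++i; ++col;
        } else if (c == '#') {
            // Comment: skip to end of line, leave the '\n' for the branch above.
            while (i < src.size() && src[i] != '\n') { ++i; ++col; }
        } else if (isSpecial(c)) {
            // A '{' on its own line still belongs to the preceding header.
            if (c == '{' && !out.empty() && out.back().kind == TokenKind::Newline)
                out.pop_back();
            const TokenKind kind = c == '{' ? TokenKind::LBrace
                                 : c == '}' ? TokenKind::RBrace
                                            : TokenKind::Colon;
            out.push_back({kind, std::string(1, c), line, col});
            ++i; ++col;
        } else {
            // Word: runs until whitespace, a special character or a continuation.
            const std::size_t start = i;
            const int startCol = col;
            while (i < src.size() &&
                   !std::isspace(static_cast<unsigned char>(src[i])) &&
                   !isSpecial(src[i]) && !atContinuation(i)) {
                ++i; ++col;
            }
            assert(i > start);  // the branches above guarantee progress
            out.push_back({TokenKind::Word, src.substr(start, i - start), line, startCol});
        }
    }
    return out;
}
```

With this sketch, lex("global{"), lex("global {") and lex("global\n\n{") each return a Word token followed by an LBrace token; only the recorded line and column of the brace differ, which is what a later error message would use.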

Once you've got a nice stream of tokens flowing, work on your parsing phase. The parser should take that sequence of tokens and build an abstract syntax tree, which models the syntactic structure of your document. At this point you shouldn't be worrying about ifstream and operator>>, since the lexer should have done all that reading for you.
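To make "abstract syntax tree" concrete, here is one possible shape for the tree of a recipe file. The type names Setting and Scope and the dump helper are hypothetical, chosen for this sketch; the fields simply mirror what appears in the question (a scope header such as lib static ogg_static, KEY value lines, and an optional condition such as win32).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One "KEY value value ..." line, optionally guarded by a condition,
// e.g. the "win32" in "win32:DEFINES NO_CRT_SECURE_DEPRECATE".
struct Setting {
    std::string condition;            // empty when the setting is unconditional
    std::string key;
    std::vector<std::string> values;
};

// One "header { ... }" block, e.g. header = {"lib", "static", "ogg_static"}.
struct Scope {
    std::vector<std::string> header;
    std::vector<Setting> settings;
    std::vector<Scope> children;      // nested scopes, if the grammar allows them
};

// Render a scope and everything below it with two-space indentation.
std::string dump(const Scope& s, int depth = 0) {
    assert(depth >= 0);
    const std::string pad(static_cast<std::size_t>(depth) * 2, ' ');
    std::string out = pad;
    for (std::size_t i = 0; i < s.header.size(); ++i)
        out += (i ? " " : "") + s.header[i];
    out += " {\n";
    for (const Setting& st : s.settings) {
        out += pad + "  ";
        if (!st.condition.empty()) out += st.condition + ":";
        out += st.key;
        for (const std::string& v : st.values) out += " " + v;
        out += "\n";
    }
    for (const Scope& child : s.children) out += dump(child, depth + 1);
    out += pad + "}\n";
    return out;
}
```

The parser's job is then to fill in Scope and Setting objects from the tokens; later stages (applying global settings to every target, evaluating conditions) work on this tree and never look at the text again.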

You've indicated an interest in calling the parsing function recursively once you see a scope. That's certainly one way to go. As you'll see, the design decision you'll have to make repeatedly is whether you literally want to call the same parse function recursively (allowing constructions like global { global { ... } }, which you may want to disallow syntactically), or whether you want to define a slightly (or even significantly) different set of syntax rules that apply inside a scope.

Once you find yourself having to vary the rules, the key is to factor whatever the syntax variants share into functions that they can all reuse. If you keep heading in this direction, using separate functions for the different chunks of syntax you want to deal with and having them call each other (possibly recursively) where needed, you'll ultimately end up with what is called a recursive descent parser. The Wikipedia entry has a good simple example of one; see http://en.wikipedia.org/wiki/Recursive_descent_parser .
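As a rough illustration of that structure, the sketch below uses one function per grammar rule. Everything here is an assumption made for the example: the Parser class and its method names are hypothetical, tokens are plain strings in which "{", "}", ":" and "\n" are the special ones, and the grammar deliberately applies different rules inside a scope (settings only), so global { global { ... } } is rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Setting { std::string condition, key; std::vector<std::string> values; };
struct Scope   { std::vector<std::string> header; std::vector<Setting> settings; };

class Parser {
public:
    explicit Parser(std::vector<std::string> toks) : toks_(std::move(toks)) {}

    // file := scope*
    std::vector<Scope> parseFile() {
        std::vector<Scope> scopes;
        skipNewlines();
        while (!atEnd()) { scopes.push_back(parseScope()); skipNewlines(); }
        return scopes;
    }

private:
    // scope := word+ "{" setting* "}"
    Scope parseScope() {
        Scope s;
        while (!atEnd() && isWord(peek())) s.header.push_back(next());
        if (s.header.empty()) fail("expected a scope name");
        skipNewlines();  // allow the brace on a line of its own
        expect("{");
        skipNewlines();
        while (!atEnd() && peek() != "}") {
            s.settings.push_back(parseSetting());
            skipNewlines();
        }
        expect("}");
        return s;
    }

    // setting := [word ":"] word word*
    Setting parseSetting() {
        Setting st;
        if (atEnd() || !isWord(peek())) fail("expected a setting name");
        st.key = next();
        if (!atEnd() && peek() == ":") {  // what we read was a condition
            next();
            st.condition = st.key;
            if (atEnd() || !isWord(peek())) fail("expected a setting name after ':'");
            st.key = next();
        }
        while (!atEnd() && isWord(peek())) st.values.push_back(next());
        return st;
    }

    static bool isWord(const std::string& t) {
        return t != "{" && t != "}" && t != ":" && t != "\n";
    }
    bool atEnd() const { return pos_ >= toks_.size(); }
    const std::string& peek() const { assert(!atEnd()); return toks_[pos_]; }
    std::string next() { assert(!atEnd()); return toks_[pos_++]; }
    void skipNewlines() { while (!atEnd() && peek() == "\n") ++pos_; }
    void expect(const std::string& t) {
        if (atEnd() || peek() != t) fail("expected '" + t + "'");
        ++pos_;
    }
    [[noreturn]] void fail(const std::string& msg) const {
        throw std::runtime_error(msg + " at token " + std::to_string(pos_));
    }

    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
};
```

If nested scopes should be legal, the loop in parseScope is the place to change: it would have to look ahead for a "{" and call parseScope recursively instead of parseSetting. In a real implementation the tokens would also carry the line and column from the lexer, so fail could report a position in the file instead of a token index.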

If you find yourself really wanting to delve deeper into the theory and practice of lexers and parsers, I do recommend you get a good solid compiler textbook to help you out. The Stack Overflow topic mentioned in the comments above will get you started: http://stackoverflow.com/questions/1669/learning-to-write-a-compiler
