Detecting and skipping line comments with Flex


Question

How can I detect one-line comments like // in Flex and skip those lines?

Also, for /* comments, will the following snippet be enough?

   "/*"         { comment(); }
  %%

    comment()
    {
        char c, c1;
    loop:
        while ((c = input()) != '*' && c != 0)
            putchar(c);

        if ((c1 = input()) != '/' && c != 0)
        {
            unput(c1);
            goto loop;
        }

        if (c != 0)
            putchar(c1);
    }

Recommended answer

Why not just use regular expressions to recognize the comments? The whole point of lex/flex is to save you from having to write lexical scanners by hand. The code you present should work (if you put the pattern /* at the beginning of the line), but it's a bit ugly, and it's not obvious that it will work.

Your question says that you want to skip comments, but the code you provide uses putchar() to print the comment, except for the /* at the beginning. Which of the two do you want? If you want to echo the comments, you can use an ECHO action instead of an empty action.

Here are the regular expressions:

Single-line comments

This one is easy because in lex/flex, . won't match a newline. So the following rule matches from // up to (but not including) the end-of-line character, and then does nothing.

"//".*                                    { /* DO NOTHING */ }
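To make it concrete which characters that rule consumes, here is a minimal sketch in plain C of what the pattern matches. It only illustrates the pattern and is not a replacement for the flex rule; the function name line_comment_length is made up for this example, and it assumes the input is an ordinary NUL-terminated string.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Returns the number of characters that the pattern "//".* would match
 * at the start of s, or 0 if s does not begin with "//".
 * The match stops before '\n' because '.' never matches a newline,
 * so the newline itself is left in the input for other rules. */
size_t line_comment_length(const char *s)
{
    if (s[0] != '/' || s[1] != '/')
        return 0;

    size_t n = 2;                          /* the two slashes */
    while (s[n] != '\0' && s[n] != '\n')   /* the ".*" part */
        n++;
    return n;
}
```

Since the newline is not part of the match, a separate rule for \n (for example, one that counts lines) still sees it after the comment has been skipped.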

Multi-line comments

This is a bit trickier, and the fact that * is a regular expression operator as well as a key part of the comment marker makes the following regex a bit hard to read. I use [*] as a pattern that recognizes the literal character *; in flex/lex, you can use "*" instead. Use whichever you find more readable. Essentially, the regular expression matches sequences of characters, each ending with a run of one or more *, until it finds one where the next character is a /. In other words, it has the same logic as your C code.

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }

The above requires the terminating */; an unterminated comment will force the lexer to back up to the beginning of the comment and accept some other token, usually a / division operator. That's likely not what you want, but it's not easy to recover from an unterminated comment, since there's no really good way to know where the comment should have ended. Consequently, I recommend adding an error rule:

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }
[/][*]                                    { fatal_error("Unterminated comment"); }
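The second rule only fires when the first one cannot match: flex prefers the rule that matches the most text, and a terminated comment is always longer than the bare two-character /*. Note that fatal_error is not something flex provides; it has to be defined in the user code of the scanner. A minimal sketch of what it might look like follows. The names current_line and format_error are assumptions made for this example (with %option yylineno, flex's own yylineno could supply the line number instead).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical: assumed to be kept up to date by the scanner's newline rule. */
int current_line = 1;

/* Hypothetical helper: builds "line N: message" in buf.
 * Returns what snprintf returns (the length the full message needs). */
int format_error(char *buf, size_t size, int line, const char *msg)
{
    return snprintf(buf, size, "line %d: %s", line, msg);
}

/* One possible definition of the fatal_error used in the rule above:
 * report the problem on stderr and stop the program. */
void fatal_error(const char *msg)
{
    char buf[256];
    format_error(buf, sizeof buf, current_line, msg);
    fprintf(stderr, "%s\n", buf);
    exit(EXIT_FAILURE);
}
```

Exiting immediately is the simplest choice here, in line with the point above that there is no good place to resume scanning after an unterminated comment.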
