JavaCC quote with escape character

Question

What is the usual way of tokenizing quoted strings that can contain an escape character? Here are some examples:

1) "this is good"
2) "this is\"good\""
3) "this \is good"
4) "this is bad\"
5) "this is \\"bad"
6) "this is bad
7)  this is bad"
8)  this is bad

Below is a sample parser that doesn't work quite right. It gives the expected result for every example except 4 and 5, which parse successfully even though they should fail.

options
{
  LOOKAHEAD = 3;
  CHOICE_AMBIGUITY_CHECK = 2;
  OTHER_AMBIGUITY_CHECK = 1;
  STATIC = false;
  DEBUG_PARSER = false;
  DEBUG_LOOKAHEAD = false;
  DEBUG_TOKEN_MANAGER = true;
  ERROR_REPORTING = true;
  JAVA_UNICODE_ESCAPE = false;
  UNICODE_INPUT = false;
  IGNORE_CASE = false;
  USER_TOKEN_MANAGER = false;
  USER_CHAR_STREAM = false;
  BUILD_PARSER = true;
  BUILD_TOKEN_MANAGER = true;
  SANITY_CHECK = true;
  FORCE_LA_CHECK = true;
}

PARSER_BEGIN(MyParser)
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
public class MyParser {
    public static void main(String[] args) throws UnsupportedEncodingException, ParseException{
        //note that this conversion to an input stream is only good for small strings
        MyParser parser = new MyParser(new ByteArrayInputStream(args[0].getBytes("UTF-8")));
        parser.enable_tracing();
        parser.myProduction();
        System.out.println("Must have worked!");
    }
}
PARSER_END(MyParser)

TOKEN:
{
<QUOTED: 
    "\"" 
    (
        "\\" ~[]    //any escaped character
        |           //or
        ~["\""]      //any non-quote character
    )* 
    "\""
>
}


void myProduction() :
{}
{
    <QUOTED>
    <EOF>
}

You can run MyParser from the command line, passing the input to parse as the first argument. It prints "Must have worked!" if parsing succeeds and throws an error if it doesn't.

How do I change this parser to correctly fail on examples 4 and 5?

Recommended answer

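In your regular expression, ~["\""] also matches a backslash, so the backslash in examples 4 and 5 can be consumed as an ordinary character and the quote after it then closes the string. To fix it, exclude the backslash from the second alternative, so that a backslash can only be consumed as the start of an escape: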

TOKEN: {
<QUOTED: 
    "\"" 
    (
         "\\" ~[]     //any escaped character
    |                 //or
        ~["\"","\\"]  //any character except quote or backslash
    )* 
    "\"" > 
}
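
To see the difference without regenerating the parser, here is a rough sketch that mirrors both token definitions with java.util.regex and runs them over the eight examples from the question. It is only a stand-in for the JavaCC token manager, not a replacement for it: the class QuotedStringCheck and its members are hypothetical names made up for this illustration, and Pattern.DOTALL is an assumption used to imitate ~[], which matches any character including line terminators.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JavaCC grammar: it mirrors the two
// QUOTED definitions with java.util.regex so they can be compared quickly.
final class QuotedStringCheck {
    // Original token: the second alternative is ~["\""], which also accepts
    // a lone backslash.
    static final Pattern ORIGINAL =
        Pattern.compile("\"(?:\\\\.|[^\"])*\"", Pattern.DOTALL);

    // Fixed token: the second alternative is ~["\"","\\"], so a backslash can
    // only be consumed together with the character that follows it.
    static final Pattern FIXED =
        Pattern.compile("\"(?:\\\\.|[^\"\\\\])*\"", Pattern.DOTALL);

    // True when the whole input is one quoted string, like <QUOTED> <EOF>.
    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The eight examples from the question, written as Java literals.
        String[] examples = {
            "\"this is good\"",
            "\"this is\\\"good\\\"\"",
            "\"this \\is good\"",
            "\"this is bad\\\"",
            "\"this is \\\\\"bad\"",
            "\"this is bad",
            "this is bad\"",
            "this is bad"
        };
        for (int i = 0; i < examples.length; i++) {
            System.out.println((i + 1) + ") " + examples[i]
                + "  original=" + accepts(ORIGINAL, examples[i])
                + "  fixed=" + accepts(FIXED, examples[i]));
        }
    }
}
```

Both patterns accept examples 1 to 3 and reject examples 6 to 8. Examples 4 and 5 are accepted only by the original pattern, which is the behaviour the fix removes.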
