Are ">>"s in type parameters tokenized using a special rule?

Question

I'm confused by the Java spec about how this code should be tokenized:

ArrayList<ArrayList<Integer>> i;

The spec says:

The longest possible translation is used at each step, even if the result does not ultimately make a correct program while another lexical translation would.

As I understand it, applying the "longest match" rule would result in the tokens:

  • ArrayList
  • <
  • ArrayList
  • <
  • Integer
  • >>
  • i
  • ;

which would not parse. But of course this code is parsed just fine.
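
To make the token list above concrete, here is a minimal sketch of a longest-match ("maximal munch") lexer. It is not the lexer of any real Java compiler: the class name LongestMatchLexer and its small operator table are assumptions made up for this illustration, and it only handles identifiers plus the few operators that matter here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer, for illustration only.
class LongestMatchLexer {
    // Operators ordered longest first, so the first hit is the longest match.
    private static final String[] OPERATORS =
            {">>>=", ">>>", ">>=", ">>", ">=", ">", "<", ";"};

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            if (Character.isJavaIdentifierStart(c)) {
                // Consume the whole identifier.
                int start = i;
                while (i < src.length()
                        && Character.isJavaIdentifierPart(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
                continue;
            }
            // Pick the longest operator that matches at position i.
            String matched = null;
            for (String op : OPERATORS) {
                if (src.startsWith(op, i)) {
                    matched = op;
                    break;
                }
            }
            if (matched == null) {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
            tokens.add(matched);
            i += matched.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [ArrayList, <, ArrayList, <, Integer, >>, i, ;]
        System.out.println(tokenize("ArrayList<ArrayList<Integer>> i;"));
    }
}
```

The two closing angle brackets come out as a single >> token, which is exactly the situation the question is asking about.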

What is the correct specification for this case?

Does it mean that a correct lexer must be context-free? It doesn't seem possible with a regular lexer.

Recommended answer

Based on reading the code linked by @sm4, it looks like the strategy is:

  • Tokenize the input normally. So A<B<C>> i; would be tokenized as A, <, B, <, C, >>, i, ; -- 8 tokens, not 9.

  • During hierarchical parsing, when the parser is working on generics and needs a >, and the next token starts with > (that is, >>, >>>, >=, >>=, or >>>=), it strips the leading > and pushes the shortened token back onto the token stream. Example: when the parser gets to >>, i, ; while working on the typeArguments rule, it successfully parses typeArguments, and the remaining token stream is now the slightly different >, i, ;, since the first > of >> was pulled off to match typeArguments.

So although tokenization does happen normally, some re-tokenization occurs in the hierarchical parsing phase, if necessary.
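
The strategy described above can be sketched as a tiny recursive-descent parser. This is a hedged illustration and not the code of any actual compiler: TypeArgumentParser, parseType, expectCloseAngle, and remaining are hypothetical names, and the grammar is reduced to identifiers with optional type arguments.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical toy parser, for illustration only.
class TypeArgumentParser {
    private final Deque<String> tokens;

    TypeArgumentParser(List<String> tokens) {
        this.tokens = new ArrayDeque<>(tokens);
    }

    // type := Identifier [ '<' type { ',' type } '>' ]
    String parseType() {
        String name = tokens.pop();
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            throw new IllegalStateException("Expected identifier but found " + name);
        }
        StringBuilder sb = new StringBuilder(name);
        if ("<".equals(tokens.peek())) {
            tokens.pop();
            sb.append('<').append(parseType());
            while (",".equals(tokens.peek())) {
                tokens.pop();
                sb.append(", ").append(parseType());
            }
            expectCloseAngle();
            sb.append('>');
        }
        return sb.toString();
    }

    // Accepts ">" directly, or splits a longer token that starts with ">"
    // (>>, >>>, >=, >>=, >>>=) and pushes the remainder back on the stream.
    private void expectCloseAngle() {
        String next = tokens.peek();
        if (next == null || !next.startsWith(">")) {
            throw new IllegalStateException("Expected '>' but found " + next);
        }
        tokens.pop();
        if (next.length() > 1) {
            tokens.push(next.substring(1)); // e.g. ">>" leaves ">" behind
        }
    }

    // Tokens not yet consumed, front of the stream first.
    List<String> remaining() {
        return List.copyOf(tokens);
    }

    public static void main(String[] args) {
        TypeArgumentParser p = new TypeArgumentParser(
                List.of("A", "<", "B", "<", "C", ">>", "i", ";"));
        System.out.println(p.parseType());  // prints A<B<C>>
        System.out.println(p.remaining());  // prints [i, ;]
    }
}
```

The lexer stays a plain longest-match lexer; only expectCloseAngle knows that a > is wanted, so the context sensitivity lives entirely in the parser.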
