How C++ compilers differentiate the token >> as a binary operator and as the end of a template argument list
Question
My question is about the parser in C++ compilers such as Clang: how does the compiler handle >> so that it knows when it is a binary operator and when it closes a template, as in std::vector<std::tuple<int, double>>? I imagine this is done at parse time. Is it better to solve this in the lexer, or to emit only > as a token and resolve the problem in the grammar?
Recommended answer
It's actually quite simple: if there is an open template bracket visible, a > closes it, even if the > would otherwise form part of a >> operator. (This doesn't apply to > characters which are part of other tokens, such as >=.) This change to C++ syntax was part of C++11, and is described in paragraph 3 of §13.3 [temp.names].
An open template bracket is not visible if the > is nested inside parentheses or square brackets. So the >> in both T<sizeof a[x >> 1]> and T<(x >> 1)> is a right-shift operator, while T<x >> 1> probably does not parse as expected: the first > closes the template argument list, leaving > 1> behind.
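The nesting rule can be checked with a small compile-time sketch. The template T, the constant x and the array a below are hypothetical names chosen to match the expressions in the text, not anything from a real code base.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical template that just exposes its non-type argument.
template <std::size_t N>
struct T {
    static constexpr std::size_t value = N;
};

constexpr int x = 8;
constexpr int a[8] = {};

// The >> is inside parentheses, so it is a right shift: 8 >> 1 == 4.
static_assert(T<(x >> 1)>::value == 4u, ">> inside () is a shift");

// The >> is inside square brackets, so it is a right shift as well;
// the argument is sizeof a[4], which is sizeof(int).
static_assert(T<sizeof a[x >> 1]>::value == sizeof(int),
              ">> inside [] is a shift");

// T<x >> 1> is deliberately left out: the first > would close the
// argument list, and the remaining "> 1>" is expected to be rejected.
```

Wrapping a shift in parentheses is the usual way to pass one as a template argument.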
The two implementation strategies are both workable, depending on where you want to put the complexity. If the lexer never generates a >> token, the parser can check that the > tokens in expr '>' '>' expr are adjacent by looking at their source locations. There will be a shift-reduce conflict, which will have to be resolved in favour of reducing the template argument list. This works because it happens that there is no ambiguity created by separating >> into two tokens, but that's not a general rule: a + ++ b is different from a ++ + b; if the lexer were only generating + tokens, that would be ambiguous.
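A minimal sketch of the parser-side strategy, assuming a toy lexer that only ever emits single-character punctuation tokens together with their source offsets. Token, lex and is_shift_at are hypothetical names for illustration; a real parser would apply this check only when no template bracket is visible.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token: its text plus the offset of its first character.
struct Token {
    std::string text;
    std::size_t offset;
};

// Toy lexer: identifiers/numbers become one token each, every other
// non-space character becomes its own token. It never produces ">>".
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
        } else {
            ++i;
        }
        out.push_back(Token{src.substr(start, i - start), start});
    }
    return out;
}

// True if toks[i] and toks[i + 1] are two '>' tokens with nothing between
// them in the source, i.e. they can be treated as one shift operator.
bool is_shift_at(const std::vector<Token>& toks, std::size_t i) {
    return i + 1 < toks.size() &&
           toks[i].text == ">" && toks[i + 1].text == ">" &&
           toks[i + 1].offset == toks[i].offset + 1;
}
```

With this sketch, the tokens of x >> 1 pass the adjacency check at index 1, while those of x > > 1 do not.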
It's not too complicated to resolve the issue with a lexer hack, if you are prepared to have your lexer track parenthesis depth. That means the lexer has to know whether a < is a template bracket or a comparison operator, but it may well already have that information.
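The lexer-side strategy might look roughly like the following toy lexer. It is only a sketch: the templates set is a hypothetical stand-in for asking the compiler whether a name designates a template, and multi-character tokens other than >> (such as >=) are left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy lexer that emits ">>" only when no template bracket is visible.
// `templates` is an assumed lookup table of names that designate templates.
std::vector<std::string> lex(const std::string& src,
                             const std::set<std::string>& templates) {
    std::vector<std::string> out;
    std::vector<int> open;  // nesting depth at which each template bracket opened
    int depth = 0;          // current () and [] nesting depth
    std::size_t i = 0;
    while (i < src.size()) {
        char c = src[i];
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isspace(uc)) { ++i; continue; }
        if (std::isalnum(uc) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            out.push_back(src.substr(start, i - start));
            continue;
        }
        if (c == '(' || c == '[') ++depth;
        if (c == ')' || c == ']') --depth;
        if (c == '<' && !out.empty() && templates.count(out.back()) != 0) {
            open.push_back(depth);  // '<' after a template name opens a bracket
        } else if (c == '>') {
            bool visible = !open.empty() && open.back() == depth;
            if (visible) {
                open.pop_back();    // this '>' closes the template bracket
            } else if (i + 1 < src.size() && src[i + 1] == '>') {
                out.push_back(">>");  // no visible bracket: a shift operator
                i += 2;
                continue;
            }
        }
        out.push_back(std::string(1, c));
        ++i;
    }
    return out;
}
```

Given vector and tuple as template names, vector<tuple<int, double>> ends in two separate > tokens, while T<(x >> 1)> keeps >> as a single token because the bracket opened at a shallower nesting depth.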
This is the more interesting question (at least imho): how is a < recognised as a template bracket rather than a less-than operator? Here there really is semantic feedback: it is a template bracket if it follows a name which designates a template.
This is not a simple determination. The name could be a class or union member, and even a member of a specialisation of a templated class or union. In the latter case, it might be necessary to calculate the values of compile-time constant expressions and then do template deduction in order to decide what the name designates.
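A contrived sketch of why the decision needs semantic information: in the hypothetical template S below, the member m is a plain constant in the primary template but a function template in the specialisation for true, so the same token sequence m < 3 > (0) means two different things. Compilers may warn about the chained comparison in the first case.

```cpp
#include <cassert>

// Primary template: m is an ordinary constant.
template <bool B>
struct S {
    static constexpr int m = 10;
};

// Specialisation: m is a function template.
template <>
struct S<true> {
    template <int N>
    static constexpr int m(int v) { return N + v; }
};

// m is a variable here, so < and > are comparisons:
// (10 < 3) > 0 evaluates to false, which converts to 0.
constexpr int as_variable = S<false>::m < 3 > (0);

// m is a template here, so <3> is a template argument list
// and (0) is the call argument: 3 + 0 == 3.
constexpr int as_template = S<true>::m<3>(0);
```

If the argument of S were a more complicated constant expression, the compiler would have to evaluate it and select the specialisation before it could even tokenise the rest of the expression correctly.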