Attribute access on int literals
Question
>>> 1 .__hash__()
1
>>> 1.__hash__()
File "<stdin>", line 1
1.__hash__()
^
SyntaxError: invalid syntax
It has been covered here before that the second example doesn't work because the "1." is parsed as the start of a float literal.
My question is: why doesn't Python parse this as attribute access on an int when the interpretation as a float is a syntax error? The docs section on lexical analysis seems to suggest whitespace is only required when other interpretations are ambiguous, but perhaps I'm reading that section wrong.
On a hunch, it seems like the lexer is greedy (taking the largest token possible), but I have no source for this claim.
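For context, there are a few standard ways to make the attribute access parse; a quick sketch of each (plain CPython behaviour, nothing beyond the language itself):

```python
# Ways to call a method on an int literal without a SyntaxError.

# 1. A space before the dot: "1 ." cannot continue a float literal,
#    so the lexer emits NUMBER(1), OP(.), NAME(__hash__).
print(1 .__hash__())    # 1

# 2. Parentheses close the literal before the dot is read.
print((1).__hash__())   # 1

# 3. Two dots: "1." is a float literal and the second dot is the
#    attribute access -- note this hashes 1.0, not the int 1.
print(1..__hash__())    # 1  (hash(1.0) == hash(1))
```

The parenthesized form is the one usually recommended, since it reads least ambiguously.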
Answer
Read the docs closely; they say:
Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).
1.__hash__()
is tokenized as:
import io, tokenize
for token in tokenize.tokenize(io.BytesIO(b"1.__hash__()").readline):
    print(token.string)
#>>> utf-8
#>>> 1.
#>>> __hash__
#>>> (
#>>> )
#>>>
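By contrast, tokenizing the version with a space shows the number and the dot as separate tokens (same tokenize-based sketch, with the empty-string NEWLINE/ENDMARKER tokens filtered out for readability):

```python
import io, tokenize

# "1 ." cannot be extended into a float token, so the dot becomes
# its own OP token and __hash__ a NAME token.
for token in tokenize.tokenize(io.BytesIO(b"1 .__hash__()").readline):
    if token.string:
        print(token.string)
#>>> utf-8
#>>> 1
#>>> .
#>>> __hash__
#>>> (
#>>> )
```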
Python's lexer will choose the token comprising the longest possible string that forms a legal token, when read from left to right; after tokenization, no two adjacent tokens may be combined into a valid token. The logic is very similar to that in your other question.
The confusion seems to come from not recognizing the tokenizing step as a completely distinct step. If the grammar allowed splitting up tokens solely to make the parser happy, then surely you'd expect
_ or1.
to tokenize as
_
or
1.
but there is no such rule, so it tokenizes as
_
or1
.
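This can be checked with the same tokenize module (a small sketch; "or1" is a legal identifier, so maximal munch keeps it as one NAME token rather than splitting it into the keyword "or" followed by "1."):

```python
import io, tokenize

# Tokenize "_ or1." and collect the non-empty token strings.
toks = [t.string
        for t in tokenize.tokenize(io.BytesIO(b"_ or1.").readline)
        if t.string]
print(toks)
#>>> ['utf-8', '_', 'or1', '.']
```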