Attribute access on int literals

Question

>>> 1 .__hash__()
1
>>> 1.__hash__()
  File "<stdin>", line 1
    1.__hash__()
             ^
SyntaxError: invalid syntax

It has been covered here before that the second example doesn't work because the lexer reads 1. as a float literal rather than as the int 1 followed by a dot.
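For illustration, here is a short sketch of spellings that do parse. The variable names are made up for this example; the first two end the int token before the dot, while the third lets 1. be a float and uses a second dot for the attribute access.

```python
# Whitespace ends the int token before the dot.
spaced = 1 .__hash__()

# Parentheses end the int token before the dot.
parenthesized = (1).__hash__()

# Here "1." really is a float literal; the second dot is the attribute access.
via_float = 1..__hash__()

print(spaced, parenthesized, via_float)  # 1 1 1
```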

My question is, why doesn't Python parse this as attribute access on an int, when the interpretation as a float is a syntax error? The docs section on lexical analysis seems to suggest that whitespace is only required when other interpretations are ambiguous, but perhaps I'm reading this section wrong.

On a hunch, it seems like the lexer is greedy (trying to take the biggest token possible), but I have no source for this claim.

Recommended answer

Read carefully, the docs say:

Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).

1.__hash__() is tokenized as:

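import io
import tokenize

# tokenize.tokenize() expects a readline callable that returns bytes.
for token in tokenize.tokenize(io.BytesIO(b"1.__hash__()").readline):
    print(token.string)

# Output recorded in the original answer:
#>>> utf-8
#>>> 1.
#>>> __hash__
#>>> (
#>>> )
#>>>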

Python's lexer chooses, reading from left to right, the token that comprises the longest possible string forming a legal token; after tokenizing, no two adjacent tokens should be able to be combined into a valid token. The logic is very similar to that in your other question.
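To make the longest-match rule concrete, here is a minimal sketch of a greedy tokenizer. It is a toy, not CPython's lexer: the names TOKEN_PATTERNS and toy_tokenize and the four simplified patterns are assumptions made up for illustration.

```python
import re

# Hypothetical, heavily simplified token patterns (not CPython's real rules).
TOKEN_PATTERNS = [
    ("FLOAT", re.compile(r"\d+\.\d*|\.\d+")),
    ("INT", re.compile(r"\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("OP", re.compile(r"[().]")),
]


def toy_tokenize(source):
    """Split source into (kind, text) pairs, always taking the longest match."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # whitespace only separates tokens
            continue
        best = None
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            # Keep the candidate that consumes the most characters.
            if match and (best is None or match.end() > best[1].end()):
                best = (kind, match)
        if best is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


print(toy_tokenize("1.__hash__()"))
# [('FLOAT', '1.'), ('NAME', '__hash__'), ('OP', '('), ('OP', ')')]
print(toy_tokenize("1 .__hash__()"))
# [('INT', '1'), ('OP', '.'), ('NAME', '__hash__'), ('OP', '('), ('OP', ')')]
```

The toy never looks ahead to see whether the parser will accept the result: at the first position FLOAT simply beats INT because it consumes one more character, and the dot is gone before __hash__ is ever read.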

The confusion seems to come from not recognizing tokenizing as a completely distinct step. If the grammar allowed splitting up tokens solely to make the parser happy, then surely you'd expect

_ or1.

to tokenize as

_
or
1.

but there is no such rule, so it tokenizes as

_
or1
. 
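This can be checked with the standard tokenize module. The sketch below is only an illustration; the helper name visible_tokens is made up here, and it filters out the bookkeeping tokens (newline and end marker) so that only names, numbers, and operators remain.

```python
import io
import tokenize


def visible_tokens(source):
    """Return the strings of the NAME, NUMBER and OP tokens in source."""
    wanted = {tokenize.NAME, tokenize.NUMBER, tokenize.OP}
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type in wanted
    ]


print(visible_tokens("_ or1.\n"))  # ['_', 'or1', '.']
```

The name or1 is taken whole because it is the longest legal token starting at that position, even though splitting it into or and 1. would have produced something the parser could accept.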
