Allowing non-ASCII identifiers (François Pinard)

Problem description

This is an excerpt from a much longer post on the python-dev mailing list.
I'm responding here, to avoid cluttering up python-dev.

[François Pinard]
<snip>

Some English readers might not really imagine, but it is a constant
misery, having to mangle identifiers while documenting and thinking
in languages other than English, merely because the Python notion of
letter is limited to the English subset. Granted, keywords and standard
library use English, this is Python, and this is not at stake here!
However, there is a good part of code in local (or in-house) programs
which is thought of as our crafted code, and even the linguistic change is
useful (to us) for segregating between what comes from the language and
what comes from us. The idea is extremely appealing of being able to
craft and polish our code (comments, strings, identifiers) to make it as
nice as it could get, while thinking in our native, natural language.

--
François Pinard http://www.iro.umontreal.ca/~pinard


</snip>

Monoglot English speakers, like me, might also benefit from reading
well-crafted Python code with non-English identifiers and comments. I learn
best by anchoring new ideas in a familiar context.

One of my (non-programmer) friends is improving his French by working
through the French versions of the Harry Potter novels.

Solution

Doug Fort wrote:

[François Pinard]
<snip>

Some English readers might not really imagine, but it is a constant
misery, having to mangle identifiers while documenting and thinking
in languages other than English, merely because the Python notion of
letter is limited to the English subset. Granted, keywords and standard
library use English, this is Python, and this is not at stake here!
However, there is a good part of code in local (or in-house) programs
which is thought of as our crafted code, and even the linguistic change is
useful (to us) for segregating between what comes from the language and
what comes from us. The idea is extremely appealing of being able to
craft and polish our code (comments, strings, identifiers) to make it as
nice as it could get, while thinking in our native, natural language.



I wonder if the proposal would be more palatable if it were restricted
to 8-bit encodings (what we used to call "code pages"). This is at least
a first step in the right direction that would help westerners and could
be made to work even if Python were compiled without Unicode support.
(it is still possible to compile Python without Unicode, isn't it?)

Paul Prescod


Paul Prescod wrote:

I wonder if the proposal would be more palatable if it were restricted
to 8-bit encodings (what we used to call "code pages"). This is at least
a first step in the right direction that would help westerners and could
be made to work even if Python were compiled without Unicode support.
(it is still possible to compile Python without Unicode, isn't it?)



I doubt that it would matter much to those currently opposed; I know
that *I* would be opposed to such a strategy: Allowing arbitrary source
code encoding is no technical challenge whatsoever, and restricting
it to single-byte encodings is an arbitrary restriction.
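
A minimal sketch of why a single-byte restriction looks arbitrary, assuming
Python 3. The sample names (French, Polish, Greek) and the helper functions
are made up for illustration and do not come from the thread. Each name fits
some 8-bit code page, but no one code page fits all three, while UTF-8
covers them all:

```python
# Illustrative names only: French, Polish and Greek words usable as identifiers.
samples = ["été", "żółw", "αβγ"]
single_byte = ["latin-1", "iso8859-2", "iso8859-7"]


def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def coverage(names, encodings):
    """Map each name to the list of encodings able to represent it."""
    return {
        name: [enc for enc in encodings if encodable(name, enc)]
        for name in names
    }


table = coverage(samples, single_byte + ["utf-8"])
for name, encs in table.items():
    print(name, "->", ", ".join(encs))
```

Any source file mixing such names would need a multi-byte encoding anyway,
which is the case the 8-bit restriction would rule out.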

I believe Guido's concern is more along the lines "How do I call a
function that has a ?? in its name, or a ?£?", or, even, "How can I
find out what the function does, by looking at its name and doc
string, if that is in Polish or Greek?" The fact that there is
a single-byte encoding for either character doesn't really help
here.

So this is about social issues, coding policies, guidelines, etc -
not about technical issues.

Regards,
Martin


[Paul Prescod]

I wonder if the proposal would be more palatable if it were restricted
to 8-bit encodings (what we used to call "code pages"). This is at
least a first step in the right direction that would help westerners
and could be made to work even if Python were compiled without Unicode
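support.

To repeat something I was writing to python-dev earlier today, it
already works by some kind of accident. A smallish main program
could do:

import locale
# Adopt the locale (and so the code page) configured in the environment.
locale.setlocale(locale.LC_ALL, '')
# Placeholder: substitute the module name of the real application.
import the_real_application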

to activate your code page, given your environment is already set for
it. This will activate proper classification of characters in <ctype.c>
and then, Python seems to behave properly with non-ASCII identifiers
within the imported application.

It is an accident because it was not meant this way by Guido, at least
as far as I know. The trick might break at various places, who knows.
I did not test it seriously, and do not intend to rely on it, as Guido
might even choose to consider this as a bug to be corrected.
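
For comparison, a minimal sketch assuming Python 3, where PEP 3131 later
made non-ASCII identifiers officially supported, so no locale call is
involved. The function and the candidate names below are made up for
illustration; this is not the trick described above, which relied on the
older interpreter's character classification:

```python
def aire_du_carré(côté):
    """Return the area of a square whose side length is côté."""
    return côté * côté


# str.isidentifier() reports whether a string is a legal identifier.
candidates = ["côté", "naïve_1", "my-module", "2fois"]
valid = [name for name in candidates if name.isidentifier()]

print(valid)
print(aire_du_carré(3))
```

Accented letters are accepted, while a hyphen or a leading digit still
makes a name invalid, exactly as for ASCII-only names.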

The plan rather seems to be to support non-ASCII identifiers widely
instead of parsimoniously, if Python ever does it, or not at all. The
decision has not been taken yet, Guido wants a PEP and a discussion
first.

In my experience, such discussions are often rough (or at least
demanding), because people have a lot of emotions on linguistic
issues, and do not always show the real relations between emotions and
rationalisations, which sometimes get convoluted.

(it is still possible to compile Python without Unicode, isn't it?)



I would guess that Unicode in Python is central if you want codecs to
work, in particular for all code pages which Python currently supports.
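
A minimal sketch of the role Unicode plays for codecs, assuming Python 3.
The byte value and the choice of code pages are illustrative assumptions.
Converting text from one code page to another goes through a Unicode string
in the middle, and the same byte means different characters depending on
the code page used to decode it:

```python
import unicodedata

raw = b"\xe9"  # a single byte whose meaning depends on the code page

# Decoding turns code-page bytes into a Unicode string.
as_latin1 = raw.decode("latin-1")
as_greek = raw.decode("iso8859-7")

# Transcoding between two code pages pivots through Unicode.
as_cp850_bytes = raw.decode("cp1252").encode("cp850")

for label, text in (("latin-1", as_latin1), ("iso8859-7", as_greek)):
    print(label, "->", text, unicodedata.name(text))
print("cp1252 -> cp850:", raw, "->", as_cp850_bytes)
```

Without a Unicode string type as the common intermediate form, each pair of
code pages would need its own conversion table.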

--
François Pinard http://www.iro.umontreal.ca/~pinard

