How to get IDLE to accept paste of Unicode characters?

Problem description

Oftentimes when I'm working interactively in IDLE, I'd like to paste a Unicode string into the IDLE window. It appears to paste properly but generates an error immediately. It has no trouble displaying the same character on output.

    >>> c = u'ĉ'
    Unsupported characters in input

    >>> print u'\u0109'
    ĉ

I suspect that the input window, like most Windows programs, uses UTF-16 internally and has no trouble dealing with the full Unicode set; the problem is that IDLE insists on coercing all input to the default mbcs code page, and anything not in that page gets rejected.

Is there any way to configure or cajole IDLE into accepting the full Unicode character set as input?

Python 3.2 handles this much better and has no trouble with anything I throw at it.

I know that I can simply save the code to a file in UTF-8 and import it, but I want to be able to work with Unicode characters in the interactive window.

Solution

I finally figured out a way. Since the source for IDLE is part of the distribution, you can make a couple of quick edits to enable the capability. The files are typically found in C:\Python27\Lib\idlelib.

The first step is to prevent IDLE from trying to encode all those nice Unicode characters into a character set that can't handle them. This is controlled by IOBinding.py. Edit the file, find the section after if sys.platform == 'win32': and comment out this line:

    #encoding = locale.getdefaultlocale()[1]

Now add this line after it:

    encoding = 'utf-8'

I was hoping that there would be a way to override this with an environment variable or something, but getdefaultlocale calls directly into a Win32 function that gets the globally configured Windows mbcs encoding.
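To see why the locale code page is the bottleneck, here is a minimal sketch. It assumes cp1252 as the Windows default code page (the real value depends on the system's regional settings), and `can_encode` is a hypothetical helper, not part of IDLE:

```python
def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeError:
        return False
    return True


# The line from the question, with U+0109 written as an escape.
pasted = u"c = u'\u0109'"

print(can_encode(pasted, "cp1252"))  # assumed default code page
print(can_encode(pasted, "utf-8"))   # the encoding forced by the edit above
```

The first check fails because U+0109 has no slot in cp1252, which is the same kind of failure that makes IDLE print "Unsupported characters in input"; UTF-8 can represent any Unicode character, so the second check succeeds.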

This is half the battle; now we need to get the command line interpreter to recognize that the input bytes are UTF-8 encoded. There didn't appear to be a way to pass an encoding into the interpreter, so I came up with the mother of all hacks. Maybe someone with a little more patience can come up with a better way, but this works for now. The input is processed in PyShell.py, in the runsource function. Change the following:

    if isinstance(source, types.UnicodeType):
        from idlelib import IOBinding
        try:
            source = source.encode(IOBinding.encoding)
        except UnicodeError:
            self.tkconsole.resetoutput()
            self.write("Unsupported characters in input\n")
            return

To:

    from idlelib import IOBinding  # line moved
    if isinstance(source, types.UnicodeType):
        try:
            source = source.encode(IOBinding.encoding)
        except UnicodeError:
            self.tkconsole.resetoutput()
            self.write("Unsupported characters in input\n")
            return
    source = "#coding=%s\n%s" % (IOBinding.encoding, source)  # line added

We're taking advantage of PEP 263 to specify the encoding for each line of input provided to the interpreter.
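The effect of the added line can be tried outside IDLE. The following is only a sketch of the idea, not IDLE's actual code path: `with_coding_cookie` and `run_line` are hypothetical names, and it uses a plain `compile`/`exec` where IDLE goes through its own interpreter machinery.

```python
def with_coding_cookie(source, encoding):
    """Encode source text and prepend a PEP 263 declaration line."""
    cookie = ("#coding=%s\n" % encoding).encode("ascii")
    return cookie + source.encode(encoding)


def run_line(source, encoding="utf-8"):
    """Compile and execute one line of input; return the resulting namespace."""
    namespace = {}
    code = compile(with_coding_cookie(source, encoding), "<input>", "exec")
    exec(code, namespace)
    return namespace


# The line from the question, with U+0109 written as an escape.
ns = run_line(u"c = u'\u0109'")
print(ns["c"] == u"\u0109")
```

Because the declaration occupies the first line of what the compiler sees, the user's input starts on line 2, so line numbers reported for errors in that input are shifted by one.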

Update: In Python 2.7.10 it is no longer necessary to make the change in PyShell.py; it already works properly if the encoding is set to utf-8. Unfortunately, I haven't found a way to bypass the change in IOBinding.py.
