How to read Unicode input and compare Unicode strings in Python?


Problem description

I work in Python and would like to read user input (from the command line) as Unicode, i.e. I am looking for a Unicode equivalent of raw_input.

Also, I would like to test Unicode strings for equality, and it looks like the standard == does not work.

Solution

raw_input() returns strings as encoded by the OS or UI facilities. The difficulty is knowing which encoding that is. You might attempt the following:

import sys, locale
# Python 2: raw_input() returns a byte string; decode it with the terminal's encoding
text = raw_input().decode(sys.stdin.encoding or locale.getpreferredencoding(True))

This should work correctly in most cases.
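For readers on Python 3, the same fallback idea can be sketched as below. This is only an illustration under stated assumptions: in Python 3, raw_input() is named input() and already returns decoded text, so explicit decoding is needed only when you read raw bytes yourself. The helper name decode_input and its encoding parameter are hypothetical and not part of the original answer.

```python
import locale
import sys


def decode_input(raw, encoding=None):
    """Decode raw bytes from the terminal into text (hypothetical helper)."""
    # Prefer an explicit encoding, then stdin's encoding, then the locale default.
    encoding = encoding or sys.stdin.encoding or locale.getpreferredencoding(True)
    return raw.decode(encoding)


# The same word arriving as UTF-8 bytes and as Latin-1 bytes
# decodes to the same text once the right encoding is used.
print(ascii(decode_input(b"\xc3\xaatre", "utf-8")))
print(ascii(decode_input(b"\xeatre", "latin-1")))
```

Passing the wrong encoding either raises UnicodeDecodeError or silently yields different characters, which is why the answer stresses knowing the encoding in use.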

We would need more detail about the Unicode comparisons that are not working in order to help you. However, it might be a matter of normalization. Consider the following:

>>> a1= u'\xeatre'
>>> a2= u'e\u0302tre'

a1 and a2 are canonically equivalent (they render the same) but are not equal as code point sequences:

>>> print a1, a2
être être
>>> print a1 == a2
False

So you might want to use the unicodedata.normalize() function to bring both strings to the same form before comparing:

>>> import unicodedata as ud
>>> ud.normalize('NFC', a1)
u'\xeatre'
>>> ud.normalize('NFC', a2)
u'\xeatre'
>>> ud.normalize('NFC', a1) == ud.normalize('NFC', a2)
True
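If you compare strings often, the normalize-then-compare step can be wrapped in a small helper. The following is a minimal sketch written for Python 3 (where all string literals are Unicode and the u prefix is optional); the name unicode_equal and the form parameter are made up for illustration and do not come from the original answer.

```python
import unicodedata


def unicode_equal(left, right, form="NFC"):
    """Return True if both strings are equal after normalization (hypothetical helper)."""
    return unicodedata.normalize(form, left) == unicodedata.normalize(form, right)


a1 = "\xeatre"     # precomposed: U+00EA (e with circumflex)
a2 = "e\u0302tre"  # decomposed: 'e' followed by U+0302 (combining circumflex)

print(a1 == a2)               # False
print(unicode_equal(a1, a2))  # True
```

NFC is used as the default here only because the answer uses it; NFD would work equally well for an equality test, as long as both sides use the same form.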

If you give us more information, we might be able to help you more, though.
