An equivalent to string.ascii_letters for unicode strings in Python 2.x?
Question
In the "string" module of the standard library,
string.ascii_letters ## Same as string.ascii_lowercase + string.ascii_uppercase
is
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
Is there a similar constant that includes everything considered a letter in Unicode?
Recommended answer
You can construct your own constant of Unicode uppercase and lowercase letters with:
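import unicodedata as ud

# Python 2: every code point in the Basic Multilingual Plane
all_unicode = u''.join(unichr(i) for i in xrange(65536))
# Keep only uppercase (Lu) and lowercase (Ll) letters
unicode_letters = u''.join(c for c in all_unicode
                           if ud.category(c) in ('Lu', 'Ll'))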
This makes a string 2153 characters long (on a narrow Unicode Python build). For membership tests such as letter in unicode_letters, it is faster to use a set instead:
unicode_letters = set(unicode_letters)
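The constant above keeps only the Lu and Ll categories, while the question asks about everything Unicode considers a letter. A minimal sketch of a broader variant follows, assuming "letter" means any general category that starts with L (Lu, Ll, Lt, Lm, Lo). The names is_letter and build_letter_set are hypothetical helpers, not part of the standard library, and the small shim is only there so the same code runs on Python 2 and Python 3.

```python
import unicodedata as ud

try:
    unichr  # exists on Python 2
except NameError:
    unichr = chr  # Python 3: chr already returns a Unicode character


def is_letter(ch):
    """Return True if ch is in any Unicode letter category (assumed: L*)."""
    return ud.category(ch).startswith('L')


def build_letter_set(limit=0x10000):
    """Collect every character below `limit` whose category is a letter."""
    return set(unichr(i) for i in range(limit) if is_letter(unichr(i)))


# Hypothetical usage: a set for fast membership tests over the BMP
letters = build_letter_set()
print(u'a' in letters, u'1' in letters)
```

For a one-off check on a single character, calling is_letter directly avoids building the set at all; the set is mainly useful when the same membership test runs many times.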