An equivalent to string.ascii_letters for Unicode strings in Python 2.x?


Question


In the "string" module of the standard library,

string.ascii_letters ## Same as string.ascii_lowercase + string.ascii_uppercase

'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'


Is there a similar constant that includes everything considered a letter in Unicode?

Recommended answer


You can construct your own constant of Unicode upper and lower case letters with:

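import unicodedata as ud

# Every code point in the Basic Multilingual Plane (Python 2: unichr, xrange)
all_unicode = u''.join(unichr(i) for i in xrange(65536))
# Keep only uppercase (Lu) and lowercase (Ll) letters
unicode_letters = u''.join(c for c in all_unicode
                           if ud.category(c) in ('Lu', 'Ll'))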


This produces a string 2153 characters long on a narrow Unicode Python build; the exact count depends on the Unicode database version bundled with the interpreter. For membership tests such as letter in unicode_letters, a set is faster than a string:

unicode_letters = set(unicode_letters)
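
The answer above keeps only the Lu and Ll categories, while the question asks about everything Unicode considers a letter. Unicode classifies five general categories as letters (Lu, Ll, Lt, Lm, Lo), so the same scan can be widened as in the sketch below. The helper name build_letter_set, the constant LETTER_CATEGORIES, and the try/except shim are illustrative assumptions rather than part of the original answer; the shim only exists so the sketch also runs on Python 3.

```python
import unicodedata as ud

# Compatibility shim (an assumption for illustration): Python 2 provides
# unichr/xrange, Python 3 provides chr/range.
try:
    _chr, _range = unichr, xrange
except NameError:
    _chr, _range = chr, range

# The five general categories that Unicode classifies as letters.
LETTER_CATEGORIES = ('Lu', 'Ll', 'Lt', 'Lm', 'Lo')


def build_letter_set(categories=LETTER_CATEGORIES, limit=0x10000):
    """Return a frozenset of characters below `limit` in the given categories.

    Hypothetical helper. The default limit covers the Basic Multilingual
    Plane, the same range the answer scans with xrange(65536).
    """
    return frozenset(
        c for c in (_chr(i) for i in _range(limit))
        if ud.category(c) in categories
    )


cased_letters = build_letter_set(('Lu', 'Ll'))  # same filter as the answer
all_letters = build_letter_set()                # every letter category

# U+4E2D is a CJK ideograph (category Lo): a letter, but not a cased one.
print(u'\u4e2d' in cased_letters)
print(u'\u4e2d' in all_letters)
```

A frozenset is used here because the result is meant to act as a constant; a plain set, as in the answer, behaves the same for membership tests.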
