Convert full-width Unicode characters into ASCII characters

Published: 2021/4/10 18:35:59 Tags: python python-2.7 unicode ascii


Question

I have some Unicode text containing full-width digits, as below:

txt = '３６fsdfdsf１４'

However, int(txt[:2]) does not recognize the characters as a number. How can I convert the characters so that they are recognized as digits?

Recommended answer

If you actually have Unicode (or decode your byte string to Unicode), you can normalize the data with the NFKC compatibility form, which replaces the full-width digits with their ASCII equivalents:

>>> s = u'３６fsdfdsf１４'
>>> s
u'\uff13\uff16fsdfdsf\uff11\uff14'
>>> import unicodedata as ud
>>> ud.normalize('NFKC',s)
u'36fsdfdsf14'
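
Once normalized, the string can be sliced and passed to int() as in the question. A minimal sketch, assuming the input is already a Unicode string; the helper name leading_number and its width parameter are hypothetical and not part of the original answer:

```python
import unicodedata

def leading_number(text, width=2):
    """Normalize text with NFKC, then parse its first `width` characters as an int."""
    normalized = unicodedata.normalize('NFKC', text)
    return int(normalized[:width])

# Full-width 3 and 6, ASCII letters, then full-width 1 and 4 (same text as above).
s = u'\uff13\uff16fsdfdsf\uff11\uff14'

print(leading_number(s))  # 36
```

Normalizing before slicing keeps the slice indices in terms of characters, which only holds when the text is a Unicode string rather than an encoded byte string.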

If NFKC normalization changes too much for you (it also rewrites other compatibility characters, such as full-width letters and ligatures), you can make a translation table of just the replacements you want:

#coding:utf8

repl = u'0123456789'

# Fullwidth digits are U+FF10 to U+FF19.
# This makes a lookup table from Unicode ordinal to the ASCII character equivalent.
xlat = dict(zip(range(0xff10,0xff1a),repl))

s = u'３６fsdfdsf１４'

print(s.translate(xlat))

Output:

36fsdfdsf14
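
To see how the two approaches differ, the sketch below runs both on a string that also contains a ligature and a full-width letter. The sample string is made up for illustration, and the code assumes Python 3 (it uses the built-in ascii() to print non-ASCII characters as escapes):

```python
import unicodedata

# Digits-only table: full-width digits U+FF10..U+FF19 -> ASCII '0'..'9'.
xlat = dict(zip(range(0xff10, 0xff1a), u'0123456789'))

# Full-width "12", the "fi" ligature (U+FB01) + "le", and a full-width "A" (U+FF21).
sample = u'\uff11\uff12 \ufb01le \uff21'

nfkc = unicodedata.normalize('NFKC', sample)
digits_only = sample.translate(xlat)

print(nfkc)                # 12 file A
print(ascii(digits_only))  # '12 \ufb01le \uff21'
```

NFKC rewrites every compatibility character it finds, while the translation table touches only the ten code points it lists.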
