Where is Python's "best ASCII for this Unicode" database?

Question

I have some text that uses Unicode punctuation, such as the left double quote and the right single quote used as an apostrophe, and I need it in ASCII. Does Python have a database of these characters with obvious ASCII substitutes, so I can do better than turning them all into "?"?

Recommended answer

Unidecode looks like a complete solution. It converts fancy quotes to ASCII quotes, accented Latin characters to their unaccented forms, and even attempts transliteration for characters that have no ASCII equivalent. That way your users don't have to see a string of ? characters when you have to pass their text through a legacy 7-bit ASCII system.

>>> from unidecode import unidecode
>>> print(unidecode("\u5317\u4EB0"))
Bei Jing 

http://www.tablix.org/~avian/blog/archives/2009/01/unicode_transliteration_in_python/
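
If installing a third-party package is not an option, the standard library can cover part of what the question asks for. The sketch below is only an approximation, not how Unidecode works: the `PUNCTUATION` table and the `to_ascii` helper are hypothetical names made up for this example, and the table lists just a handful of characters chosen by hand. Accents are removed with `unicodedata.normalize("NFKD", ...)`, and anything still outside ASCII falls back to "?".

```python
import unicodedata

# Hypothetical, hand-picked substitutions; not a database shipped with Python.
PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}
_TABLE = str.maketrans(PUNCTUATION)


def to_ascii(text):
    """Return an ASCII approximation of text (illustrative helper)."""
    # 1. Replace the punctuation we know about.
    text = text.translate(_TABLE)
    # 2. Decompose accented letters into base letter + combining marks,
    #    then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 3. Whatever is still non-ASCII becomes "?".
    return stripped.encode("ascii", "replace").decode("ascii")


print(to_ascii("\u201cna\u00efve caf\u00e9\u201d, it\u2019s"))
```

Unlike Unidecode, this sketch makes no attempt at transliteration, so text such as Chinese characters still ends up as "?".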
