What is the best way to remove accents (normalize) in a Python unicode string?

Views: 77 | Posted: 2020/10/21 20:19:52 | Tags: python, python-3.x, unicode, python-2.x, diacritics

Question

I have a Unicode string in Python, and I would like to remove all the accents (diacritics).

I found an elegant way to do this on the web (in Java):

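1. Convert the Unicode string to its long normalized form (with separate characters for letters and diacritics).

2. Remove all the characters whose Unicode type is "diacritic".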

Do I need to install a library such as pyICU, or is this possible with just the Python standard library? And what about Python 3?

Important note: I would like to avoid code with an explicit mapping from accented characters to their non-accented counterparts.

Recommended answer

Unidecode (a third-party package, not part of the standard library) is the correct answer for this. It transliterates any Unicode string into the closest possible representation in ASCII text.

Example:

import unidecode  # third-party package: pip install Unidecode

accented_string = u'Málaga'
# accented_string is of type 'unicode' in Python 2 ('str' in Python 3)
unaccented_string = unidecode.unidecode(accented_string)
# unaccented_string contains 'Malaga' and is of type 'str'
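
For the standard-library route the question asks about, the two steps it describes (decompose, then drop the diacritic characters) can be sketched with the `unicodedata` module. This is a minimal sketch, assuming Python 3; the function name `strip_accents` is illustrative and not from the original answer, and the choice of NFD and of the `Mn` category is one reasonable reading of "long normalized form" and "diacritic".

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    # Step 1: decompose, e.g. 'á' becomes 'a' + U+0301 (combining acute accent).
    decomposed = unicodedata.normalize('NFD', text)
    # Step 2: drop every character whose Unicode category is 'Mn'
    # (Mark, nonspacing), which covers the combining diacritics.
    return ''.join(ch for ch in decomposed if unicodedata.category(ch) != 'Mn')


print(strip_accents('Málaga'))  # Malaga
print(strip_accents('Łódź'))    # Łodz ('Ł' has no decomposition, so it is kept)
```

Letters that have no canonical decomposition, such as 'Ł' or 'ø', pass through this sketch unchanged and the result may still contain non-ASCII characters. That is one reason a transliteration library such as Unidecode may still be preferable when pure ASCII output is required.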
