Find characters that are similar glyphically in Unicode?

Views: 123 · Published: 2020/6/16 19:06:39 · Tags: unicode, glyph


Question

Let's say I have the characters Ú, Ù, Ü. All of them are glyphically similar to the English U.

Is there some list or algorithm that can do the following?

Given Ú, Ù, or Ü, return the English U.
Given an English U, return a list of all U-like characters.

I'm not sure whether the code points of Unicode characters are the same across all fonts. If they are, I suppose there could be some easy and efficient way to do this?

Update

If you're using Ruby, the unicode-confusable gem is available for this and may help in some cases.

Recommended answer

This won't work in all cases, but one way to get rid of most accents is to convert the characters to their decomposed form (NFD), then throw away the combining accents:

import unicodedata as ud

s = 'U, Ù, Ú, Û, Ü, Ũ, Ū, Ŭ, Ů, Ű, Ų, Ư, Ǔ, Ǖ, Ǘ, Ǚ, Ǜ, Ụ, Ủ, Ứ, Ừ, Ử, Ữ, Ự'
# Decompose each character (NFD), then drop the non-ASCII combining marks.
print(ud.normalize('NFD', s).encode('ascii', 'ignore').decode('ascii'))

Output

U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U, U
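
One limit of the ASCII-encoding trick is that it silently drops any character that is not ASCII after decomposition. A possible variation, sketched here with a hypothetical helper named strip_marks, filters on unicodedata.combining instead, so base letters without a canonical decomposition (such as Ø or ł) pass through unchanged rather than vanishing:

```python
import unicodedata as ud

def strip_marks(text):
    """Remove combining marks but keep every base character, ASCII or not.

    strip_marks is an illustrative name, not part of the original answer.
    """
    decomposed = ud.normalize('NFD', text)
    # combining() returns a non-zero class for combining marks such as accents.
    return ''.join(ch for ch in decomposed if not ud.combining(ch))

print(strip_marks('Ú Ù Ü'))
print(strip_marks('Ø ł'))  # no canonical decomposition, so left unchanged
```

Characters like Ø still need a hand-written mapping or a confusables table if they should fold to O.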

To find the accented characters for each letter, use something like:

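import unicodedata as ud
import string

def asc(ch):
    # Decompose, then drop everything outside ASCII (the combining marks).
    return ud.normalize('NFD', ch).encode('ascii', 'ignore').decode('ascii')

# Every code point in the Basic Multilingual Plane.
U = ''.join(chr(i) for i in range(65536))
for c in string.ascii_letters:
    print(''.join(u for u in U if asc(u) == c))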

Output

aàáâãäåāăąǎǟǡǻȁȃȧḁạảấầẩẫậắằẳẵặ
bḃḅḇ
cçćĉċčḉ
dďḋḍḏḑḓ
eèéêëēĕėęěȅȇȩḕḗḙḛḝẹẻẽếềểễệ
fḟ
 :
etc.
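
The loop above rescans all 65536 code points once per letter. If the lookup is needed repeatedly, one option is to build the reverse table in a single pass. This is only a sketch: build_variants and its limit parameter are names introduced here for illustration, and unlike the output above it leaves the plain letter itself out of each list.

```python
import string
import unicodedata as ud
from collections import defaultdict

def build_variants(limit=0x10000):
    """Map each ASCII letter to the characters that reduce to it under NFD.

    build_variants and limit are hypothetical names, not from the answer.
    """
    variants = defaultdict(list)
    for cp in range(limit):
        ch = chr(cp)
        base = ud.normalize('NFD', ch).encode('ascii', 'ignore').decode('ascii')
        # Keep only characters that reduce to exactly one ASCII letter.
        if len(base) == 1 and base in string.ascii_letters and ch != base:
            variants[base].append(ch)
    return variants

variants = build_variants()
print(''.join(variants['U']))
```

After the table is built, variants['U'] answers the second half of the question with a dictionary lookup.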
