How to convert letters with accents, umlauts, etc. to their ASCII counterparts in Perl?
Problem description
I'm writing a program in Perl that works with documents, and many of the documents contain characters such as ä, ö, ü, é, etc. (both capital and lowercase). I'd like to replace them with their ASCII counterparts: a, o, u, e, etc. How would I do that in Perl?
One of the solutions I thought of is to have a hash with keys being the umlaut and accent characters, and the values being ASCII counterparts, but that requires me to have a list of all umlaut and accent characters, which I don't have, and if I built a list, I'd certainly miss many as I'm unfamiliar with all the possible characters that could have umlauts, accents and other diacritics.
Recommended answer
As usual, if you think of a problem that is almost certainly not yours alone, there's already a solution on CPAN. In this case it's called Text::Unidecode:
use warnings;
use strict;
use utf8;
use Text::Unidecode;
print unidecode('ä, ö, ü, é'); # will print 'a, o, u, e'
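If installing a CPAN module is not an option, a rough alternative (not part of the answer above, and offered only as a sketch) is to decompose each character with the core Unicode::Normalize module and then delete the combining marks. The helper name strip_diacritics is hypothetical. This only handles letters that decompose into a base letter plus marks; characters such as ß or ø have no such decomposition and pass through unchanged, so Text::Unidecode is likely the more complete choice for general text.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);  # core module, no CPAN install needed

# Hypothetical helper: split accented letters into base letter + combining
# marks (canonical decomposition), then remove the marks.
sub strip_diacritics {
    my ($text) = @_;
    my $decomposed = NFD($text);   # e.g. 'ä' becomes 'a' followed by U+0308
    $decomposed =~ s/\p{Mn}+//g;   # drop nonspacing (combining) marks
    return $decomposed;
}

print strip_diacritics('ä, ö, ü, é, Ä, Ö, Ü, É'), "\n";  # prints 'a, o, u, e, A, O, U, E'
```

In both approaches the input must already be decoded into Perl character strings (for example by reading files through an ':encoding(UTF-8)' layer); raw bytes will not be recognised as accented letters.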