How to convert letters with accents, umlauts, etc. to their ASCII counterparts in Perl?

Question

I'm writing a Perl program that works with documents, and many of them contain characters such as ä, ö, ü, é, etc. (both uppercase and lowercase). I'd like to replace them with their ASCII counterparts a, o, u, e, etc. How would I do that in Perl?

One solution I thought of is a hash whose keys are the umlaut and accented characters and whose values are the ASCII counterparts. That requires a list of all such characters, which I don't have, and if I built one myself I would certainly miss many, as I'm unfamiliar with all the characters that can carry umlauts, accents and other diacritics.

Recommended answer

As usual, if the problem is almost certainly not yours alone, there's already a solution on CPAN. In this case it's called Text::Unidecode:

use warnings;
use strict;
use utf8;            # the string literal below is written in UTF-8
use Text::Unidecode; # CPAN module, not part of core Perl
print unidecode('ä, ö, ü, é'), "\n"; # prints 'a, o, u, e'
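
If installing a CPAN module is not an option, a rough alternative is to decompose each character with the core Unicode::Normalize module and then delete the combining marks. This is not the answer's method, only a minimal sketch of a different technique, and `strip_marks` is a hypothetical helper name chosen for illustration.

```perl
use strict;
use warnings;
use utf8;                        # the string literal below is written in UTF-8
use Unicode::Normalize qw(NFD);  # ships with core Perl

# Hypothetical helper (not from the answer): decompose, then drop the marks.
sub strip_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);     # 'ä' becomes 'a' followed by U+0308
    $decomposed =~ s/\p{Mn}+//g;     # delete nonspacing combining marks
    return $decomposed;
}

binmode STDOUT, ':encoding(UTF-8)';  # only matters if non-ASCII text survives
print strip_marks('ä, ö, ü, é'), "\n";   # prints 'a, o, u, e'
```

This only handles characters that decompose into a base letter plus combining marks. Letters such as ß, ø or æ have no such decomposition and pass through unchanged, so for those a transliteration module like Text::Unidecode is still the more complete choice.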
