How to match Unicode vowels?


Question

What character class or Unicode property will match any Unicode vowel in Perl?

Wrong answer: [aeiouAEIOU]. (Sermon here, item #24 in the laundry list.)

perluniprops mentions vowels only for Hangul and Indic scripts.

Let's set aside the question of what a vowel is. Yes, "i" may not be a vowel in some contexts. So any character that can be a vowel will do.

Recommended answer

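There is no such property. For example, none of the properties that uniprops reports for the letter "a" marks it as a vowel: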

$ uniprops --all a
U+0061 <a> \N{LATIN SMALL LETTER A}
    \w \pL \p{LC} \p{L_} \p{L&} \p{Ll}
    AHex POSIX_XDigit All Alnum X_POSIX_Alnum Alpha X_POSIX_Alpha Alphabetic Any ASCII
       ASCII_Hex_Digit Assigned Basic_Latin ID_Continue Is_IDC Cased Cased_Letter LC
       Changes_When_Casemapped CWCM Changes_When_Titlecased CWT Changes_When_Uppercased CWU Ll L
       Gr_Base Grapheme_Base Graph X_POSIX_Graph GrBase Hex X_POSIX_XDigit Hex_Digit IDC ID_Start
       IDS Letter L_ Latin Latn Lowercase_Letter Lower X_POSIX_Lower Lowercase PerlWord POSIX_Word
       POSIX_Alnum POSIX_Alpha POSIX_Graph POSIX_Lower POSIX_Print Print X_POSIX_Print Unicode Word
       X_POSIX_Word XDigit XID_Continue XIDC XID_Start XIDS
    Age=1.1 Age=V1_1 Block=Basic_Latin Bidi_Class=L Bidi_Class=Left_To_Right BC=L
       Bidi_Paired_Bracket_Type=None Block=ASCII BLK=ASCII Canonical_Combining_Class=0
       Canonical_Combining_Class=Not_Reordered CCC=NR Canonical_Combining_Class=NR
       Decomposition_Type=None DT=None East_Asian_Width=Na East_Asian_Width=Narrow EA=Na
       Grapheme_Cluster_Break=Other GCB=XX Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA
       Hangul_Syllable_Type=Not_Applicable HST=NA Indic_Positional_Category=NA InPC=NA
       Indic_Syllabic_Category=Other InSC=Other Joining_Group=No_Joining_Group JG=NoJoiningGroup
       Joining_Type=Non_Joining JT=U Joining_Type=U Script=Latin Line_Break=AL
       Line_Break=Alphabetic LB=AL Numeric_Type=None NT=None Numeric_Value=NaN NV=NaN
       Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0
       Present_In=3.1 IN=3.1 Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1
       Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 Present_In=6.0 IN=6.0
       Present_In=6.1 IN=6.1 Present_In=6.2 IN=6.2 Present_In=6.3 IN=6.3 Present_In=7.0 IN=7.0
       Present_In=8.0 IN=8.0 SC=Latn Script=Latn Script_Extensions=Latin Scx=Latn
       Script_Extensions=Latn Sentence_Break=LO Sentence_Break=Lower SB=LO Word_Break=ALetter WB=LE
       Word_Break=LE
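
The listing above does include Hangul_Syllable_Type and Indic_Syllabic_Category, which are the only places where Unicode records vowel information, and only for those scripts. As a minimal sketch of what they can and cannot match, assuming a reasonably recent Perl whose Unicode tables include Indic_Syllabic_Category (the helper name `is_script_defined_vowel` is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only when Unicode itself classifies the
# single character as a vowel, which it does for Hangul vowel jamo (HST=V)
# and for Indic independent or dependent vowels (InSC), and nothing else.
sub is_script_defined_vowel {
    my ($char) = @_;
    return $char =~ /\A(?:
          \p{HST=V}                  # Hangul vowel jamo
        | \p{InSC=Vowel_Independent} # e.g. DEVANAGARI LETTER A
        | \p{InSC=Vowel_Dependent}   # e.g. DEVANAGARI VOWEL SIGN AA
    )\z/x ? 1 : 0;
}

# Print the result for two Hangul/Indic vowels, a vowel sign, and Latin "a".
for my $cp (0x1161, 0x0905, 0x093E, 0x0061) {
    printf "U+%04X: %d\n", $cp, is_script_defined_vowel(chr $cp);
}
```

Latin "a" is not matched by any of these, which is the point of the answer: outside Hangul and Indic scripts there is nothing to query.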

The most important thing when dealing with i18n is to think about what you actually need, yet you didn't even mention what you are trying to accomplish.

Find vowels? That can't be what you are actually trying to do. I could see a use for identifying vowel sounds in a word, but those are often formed from multiple letters (such as "oo" in English, and "in", "an"/"en", "ou", "ai", "au"/"eau", "eu" in French), and the rules are language-specific.

As it stands, you're asking for a global solution while defining the problem in local terms. Start by defining the actual problem you are trying to solve.
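
To illustrate what a locally defined problem might look like, here is a sketch that answers a much narrower question: "is this single character a Latin a/e/i/o/u, possibly carrying combining diacritics?" The helper name `has_latin_base_vowel` and the choice of base letters are assumptions made for this example, not something Unicode defines. The sketch decomposes the character with NFD from the core Unicode::Normalize module and strips nonspacing marks before comparing.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper for one narrow, Latin-only definition of "vowel".
# The base-letter list [aeiouAEIOU] is an assumption, not a Unicode property.
sub has_latin_base_vowel {
    my ($char) = @_;
    my $decomposed = NFD($char);     # e.g. U+00E9 becomes "e" + U+0301
    $decomposed =~ s/\p{Mn}+//g;     # drop nonspacing (combining) marks
    return $decomposed =~ /\A[aeiouAEIOU]\z/ ? 1 : 0;
}

# U+00E9 (e with acute), U+1EC7 (e with circumflex and dot below),
# U+00F1 (n with tilde), U+00F8 (o with stroke).
for my $cp (0x00E9, 0x1EC7, 0x00F1, 0x00F8) {
    printf "U+%04X: %d\n", $cp, has_latin_base_vowel(chr $cp);
}
```

Even this narrow version has gaps: U+00F8 has no canonical decomposition, so it is not recognized, and "y", digraphs, and every non-Latin script are ignored. Which of those matter depends entirely on the language and task at hand.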
