RegEx: Compare two strings to find Alliteration and Assonance

Question

Would it be possible to compare two strings to find alliteration and assonance?

I mainly use JavaScript or PHP.

Solution

I'm not sure that a regex would be the best way of building a robust comparison tool. A simple regex might be part of a larger solution that used more sophisticated algorithms for non-exact matching.
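As a rough illustration of the "simple regex" part, here is a minimal sketch that compares two strings by spelling only. The helper names (`words`, `initialSounds`, `vowelGroups`, `intersect`) are hypothetical, and the sketch assumes that letters stand in for sounds, which is only loosely true in English ("city" and "cat" share a first letter but not a first sound).

```javascript
// Split a string into lowercase words made of letters and apostrophes.
function words(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Unique items of xs that also occur in ys, in the order they appear in xs.
function intersect(xs, ys) {
  const seen = new Set(ys);
  return [...new Set(xs)].filter((x) => seen.has(x));
}

// First letter of every word that begins with a consonant letter.
// Words starting with a vowel or an apostrophe are skipped.
function initialSounds(text) {
  return words(text)
    .map((w) => (w.match(/^[^aeiou']/) || [""])[0])
    .filter(Boolean);
}

// Runs of vowel letters inside each word, e.g. "green" -> ["ee"].
function vowelGroups(text) {
  return words(text).flatMap((w) => w.match(/[aeiou]+/g) || []);
}

// Candidate alliteration: initial consonant letters the two strings share.
console.log(intersect(
  initialSounds("Peter Piper picked"),
  initialSounds("a peck of pickled peppers")
)); // [ 'p' ]

// Candidate assonance: vowel groups the two strings share.
console.log(intersect(
  vowelGroups("the green leaf"),
  vowelGroups("a deep sea")
)); // [ 'ee', 'ea' ]
```

A spelling-based pass like this can shortlist candidates cheaply; a phonetic encoding is still needed to confirm that the shared letters actually sound alike.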

There are a variety of readily-available options for English, some of which could be extended fairly simply to languages that use the Latin alphabet. Most of these algorithms have been around for years or even decades and are well-documented, though they all have limits.

I imagine that there are similar algorithms for non-Latin alphabets but I can't comment on their availability firsthand.

Phonetic Algorithms

The Soundex algorithm is nearly 100 years old and has been implemented in multiple programming languages. It encodes a string as a short code (a letter followed by three digits) based on its pronunciation. It is not precise, but it may be useful for identifying similar-sounding words or syllables. I've experimented with it in Microsoft SQL Server, and it is available in PHP.

http://php.net/manual/en/function.soundex.php
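PHP ships `soundex()` as a built-in, but JavaScript has no equivalent in its standard library. The following is a hand-written sketch of the common "American Soundex" rules, assuming plain ASCII input; the names `SOUNDEX_CODES` and `soundex` are my own, and it has not been checked against PHP's output for every edge case.

```javascript
// Consonant letters mapped to Soundex digits. Vowels, y, h and w have no code.
const SOUNDEX_CODES = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundex(word) {
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  // The code always starts with the first letter itself.
  let result = letters[0].toUpperCase();
  let last = SOUNDEX_CODES[letters[0]] || "";

  for (const ch of letters.slice(1)) {
    if (result.length === 4) break;
    // h and w are ignored entirely: equal codes on either side still collapse.
    if (ch === "h" || ch === "w") continue;
    const code = SOUNDEX_CODES[ch] || "";
    // Emit a digit only when it differs from the previous letter's code.
    if (code !== "" && code !== last) result += code;
    // A vowel resets `last`, so the same digit may repeat after it.
    last = code;
  }
  return result.padEnd(4, "0");
}

console.log(soundex("Robert"), soundex("Rupert")); // R163 R163
console.log(soundex("Ashcraft"));                  // A261
```

Because the code keeps the first letter verbatim, two words with equal Soundex codes always share an initial letter, which makes it a coarse filter for alliteration but a poor one for assonance (vowels are discarded).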

General consensus (including the PHP docs) is that Metaphone is much more accurate than Soundex when dealing with the English language. There are numerous implementations available (Wikipedia has a long list at the end of the article) and it is included in PHP.

http://www.php.net/manual/en/function.metaphone.php

Double Metaphone supports a second encoding of a word, corresponding to an alternate pronunciation of that word.

As with Metaphone, Double Metaphone has been implemented in many programming languages (example).

Word Deconstruction

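Levenshtein distance (the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another) can be used to suggest alternate spellings (for example, to normalize user input) and might be useful as part of a more granular algorithm for alliteration and assonance.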

http://www.php.net/manual/en/function.levenshtein.php
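In JavaScript there is no built-in counterpart to PHP's `levenshtein()`, so here is a minimal sketch of the standard dynamic-programming version with unit costs. The `closestWord` helper and its word list are hypothetical, included only to illustrate the input-normalization idea.

```javascript
// Edit distance with unit cost for insert, delete and substitute.
// Only two rows of the DP table are kept in memory at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i - 1]
        curr[j - 1] + 1,    // insert b[j - 1]
        prev[j - 1] + cost  // substitute, or keep when the letters match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Pick the candidate with the smallest distance to the input (first one wins ties).
function closestWord(input, candidates) {
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const d = levenshtein(input, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}

console.log(levenshtein("kitten", "sitting")); // 3
console.log(closestWord("aliteration", ["assonance", "alliteration", "consonance"]));
// alliteration
```

The same function could be applied to phonetic codes rather than raw spellings, so that near-matching encodings count as near-matching sounds.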

Logically, it would help to understand the syllabication of the words in the string so that each word could be deconstructed. The syllable break could resolve ambiguity as to how two adjacent letters should be pronounced. This thread has a few links:

PHP Syllable Detection
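To give a feel for what a first pass at syllable detection might look like, here is a crude heuristic that counts runs of vowel letters. It is an assumption-laden sketch, not one of the methods from the linked thread: the function name `estimateSyllables` is hypothetical, `y` is always treated as a vowel, and a trailing `e` is assumed silent unless the word ends in `le`.

```javascript
// Rough syllable estimate for a single English word, based on spelling alone.
function estimateSyllables(word) {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;

  // Assume a final "e" is silent ("assonance"), except in "-le" ("table").
  const silentE = w.length > 2 && w.endsWith("e") && !w.endsWith("le");
  const trimmed = silentE ? w.slice(0, -1) : w;

  // Each run of vowel letters is counted as one syllable nucleus.
  const groups = trimmed.match(/[aeiouy]+/g) || [];
  return Math.max(1, groups.length);
}

console.log(estimateSyllables("alliteration")); // 5
console.log(estimateSyllables("assonance"));    // 3
console.log(estimateSyllables("table"));        // 2
```

Heuristics like this miscount plenty of words (it reports 1 for "rhythm", for instance), which is why dictionary-based or hyphenation-pattern approaches tend to be preferred when accuracy matters.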
