The most efficient way to implement a phonetic search

Question

What is the most efficient way to implement a phonetic search in C++ and/or Java? By phonetic search I mean substituting vowels or consonants that sound similar. This would be especially useful for names because sometimes people's names have sort of strange spellings.

I am thinking it might be effective to substitute vowels and some consonants. It may also be good to include some special cases, like a silent E at the end of a word, or F and PH. Would it be best to use cstrings or strings in C++? Would it be better to store a copy in memory with the substituted values, or to call a function every time we look for something?

Recommended answer

Soundex along with its variants is the standard algorithm for this. It uses phonetic rules to transform the name into an alphanumeric code. Names with the same code are grouped together.
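To make the idea concrete, here is a minimal sketch of the classic American Soundex encoding in C++. The function names `soundexDigit` and `soundex` are my own choices for illustration, and the sketch assumes plain ASCII names; other variants (Metaphone, Daitch-Mokotoff, and so on) use different rules.

```cpp
#include <cctype>
#include <string>

// Returns the Soundex digit for an uppercase letter.
// '0' marks vowels (and Y), which produce no digit but end a run of equal codes.
// '-' marks H and W, which are skipped without ending a run.
char soundexDigit(char c) {
    switch (c) {
        case 'B': case 'F': case 'P': case 'V':
            return '1';
        case 'C': case 'G': case 'J': case 'K':
        case 'Q': case 'S': case 'X': case 'Z':
            return '2';
        case 'D': case 'T':
            return '3';
        case 'L':
            return '4';
        case 'M': case 'N':
            return '5';
        case 'R':
            return '6';
        case 'H': case 'W':
            return '-';
        default:
            return '0';
    }
}

// Encodes a name as one letter followed by three digits, e.g. "R163".
// Non-alphabetic characters are ignored; an input with no letters yields "".
std::string soundex(const std::string& name) {
    std::string code;
    char last = '0';  // digit of the previously considered letter
    for (unsigned char raw : name) {
        if (!std::isalpha(raw)) continue;
        char c = static_cast<char>(std::toupper(raw));
        char d = soundexDigit(c);
        if (code.empty()) {
            code.push_back(c);  // the first letter is kept as-is
            last = d;
            continue;
        }
        if (d == '-') continue;  // H/W: skip and keep the previous digit
        if (d != '0' && d != last) code.push_back(d);
        last = d;
        if (code.size() == 4) break;
    }
    if (code.empty()) return code;
    code.append(4 - code.size(), '0');  // pad short codes with zeros
    return code;
}
```

With this sketch, `soundex("Robert")` and `soundex("Rupert")` both return `"R163"`, and `soundex("Smith")` and `soundex("Smyth")` both return `"S530"`, so those spellings would land in the same group.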

As far as implementing the search, I'd use a data structure that maps each Soundex code to the list of names that have that code. Depending on the data structure used (a hash table or a tree), the lookup could be done in time that is either constant or logarithmic in the number of distinct Soundex codes.
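A rough sketch of such a structure, using the hash table option via `std::unordered_map`, might look like the following. The class name `PhoneticIndex` and its `add`/`find` methods are hypothetical, and the encoder is passed in as a function so that any phonetic code (Soundex or a variant) can be plugged in.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Maps each phonetic code to the list of names that share it.
// The encoder is supplied by the caller (hypothetical interface).
class PhoneticIndex {
public:
    using Encoder = std::function<std::string(const std::string&)>;

    explicit PhoneticIndex(Encoder encoder) : encoder_(std::move(encoder)) {}

    // Encodes the name once and stores it in the bucket for its code.
    void add(const std::string& name) {
        buckets_[encoder_(name)].push_back(name);
    }

    // Returns every stored name whose code matches the query's code,
    // or an empty list if no name shares that code.
    const std::vector<std::string>& find(const std::string& query) const {
        static const std::vector<std::string> kEmpty;
        auto it = buckets_.find(encoder_(query));
        return it == buckets_.end() ? kEmpty : it->second;
    }

private:
    Encoder encoder_;
    std::unordered_map<std::string, std::vector<std::string>> buckets_;
};
```

In this arrangement each stored name is encoded once, when it is added, and a search only encodes the query string, which is one way to settle the "store a copy or call a function every time" question. Swapping `std::unordered_map` for `std::map` would give the tree-based variant with logarithmic lookup.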

I am not sure what exactly you mean by cstring (Microsoft's CString?) but the standard std::string class will be perfectly fine for this problem and would be my preferred choice.
