在VBA中查找相似的发音文本 [英] Finding similar sounding text in VBA
问题描述
我的经理告诉我,有一种方法可以评估拼写不同但听起来相似的名称.理想情况下,我们希望能够评估用户输入的搜索名称并返回完全匹配的名称以及相似的发音"名称.他称此过程为"Soundits",但我无法在Google上找到任何信息.
My manager tells me that there is a way to evaluate names that are spelled differently but sound similar in the way they are pronounced. Ideally, we want to be able to evaluate a user-entered search name and return exact matches as well as "similar sounding" names. He called the process "Soundits" but I cannot find any info on Google.
这存在吗?有谁知道它是否可用于VBA(访问)?
Does this exist? Does anyone know if it is available for VBA (Access)?
推荐答案
很好的问题!您要提出的问题包括这个想法本身的一个很好的例子.
Nice question! You're question includes a great example of the idea itself.
有一种称为Russell Soundex 算法的算法,这是许多应用程序中的标准技术,该算法通过语音而不是实际拼写来评估姓名.在这个问题中, Soundits 和 Soundex 是相似的名字!
There is an algorithm called the Russell Soundex algorithm, a standard technique in many applications, that evaluates names by the phonetic rather than the actual spelling. In this question, Soundits and Soundex are similar sounding names!
关于Soundex:
Soundex算法基于英语的特征,例如:
The Soundex algorithm is predicated on characteristics of English such as:
- 第一个字母具有重要意义
- 许多辅音听起来很相似
- 辅音对发音的影响大于元音
一个警告:Soundex是为名称而设计的.越短越好.随着名称的增加,Soundex的可靠性降低.
One warning: Soundex was designed for names. The shorter the better. As a name grows longer, the Soundex becomes less reliable.
资源:
- 以下是使用VBA进行访问的示例.
- Ken Getz和Mike Gilbert在 VBA开发人员手册第二版中有关于Soundex的文章.
- 关于Soundex和其他变体(例如Soundex2)(搜索"Soundex"和"VBA")的信息很多.
- Here is an example that uses VBA for Access.
- There is a write-up on Soundex in the VBA Developer's Handbook, 2nd Edition by Ken Getz and Mike Gilbert.
- There is a lot of information about Soundex and other variants such as Soundex2 (Search for 'Soundex' and 'VBA').
代码示例:
下面是通过快速Web搜索找到的一些VBA代码,该代码实现了Soundex算法的一种变体.
Below is some VBA code, found via a quick web search, that implements a variation of the Soundex algorithm.
Option Compare Database
Option Explicit
Public Function Soundex(varText As Variant) As Variant
On Error GoTo Err_Handler
Dim strSource As String
Dim strOut As String
Dim strValue As String
Dim strPriorValue As String
Dim lngPos As Long
If Not IsError(varText) Then
strSource = Trim$(Nz(varText, vbNullString))
If strSource <> vbNullString Then
strOut = Left$(strSource, 1&)
strPriorValue = SoundexValue(strOut)
lngPos = 2&
Do
strValue = SoundexValue(Mid$(strSource, lngPos, 1&))
If ((strValue <> strPriorValue) And (strValue <> vbNullString)) Or (strValue = "0") Then
strOut = strOut & strValue
strPriorValue = strValue
End If
lngPos = lngPos + 1&
Loop Until Len(strOut) >= 4&
End If
End If
If strOut <> vbNullString Then
Soundex = strOut
Else
Soundex = Null
End If
Exit_Handler:
Exit Function
Err_Handler:
MsgBox "Error " & Err.Number & ": " & Err.Description, vbExclamation, "Soundex()"
Resume Exit_Handler
End Function
Private Function SoundexValue(strChar As String) As String
Select Case strChar
Case "B", "F", "P", "V"
SoundexValue = "1"
Case "C", "G", "J", "K", "Q", "S", "X", "Z"
SoundexValue = "2"
Case "D", "T"
SoundexValue = "3"
Case "L"
SoundexValue = "4"
Case "M", "N"
SoundexValue = "5"
Case "R"
SoundexValue = "6"
Case vbNullString
SoundexValue = "0"
Case Else
'Return nothing for "A", "E", "H", "I", "O", "U", "W", "Y", non-alpha.
End Select
End Function
Levenshtein距离
比较字符串的另一种方法是获取 Levenshtein距离.这是VBA中给出的示例,该示例摘自 LessThanDot Wiki :>
Another method of comparing strings is to get the Levenshtein distance. Here is the example given in VBA, it is taken from LessThanDot Wiki:
Function LevenshteinDistance(word1, word2)
Dim s As Variant
Dim t As Variant
Dim d As Variant
Dim m, n
Dim i, j, k
Dim a(2), r
Dim cost
m = Len(word1)
n = Len(word2)
''This is the only way to use
''variables to dimension an array
ReDim s(m)
ReDim t(n)
ReDim d(m, n)
For i = 1 To m
s(i) = Mid(word1, i, 1)
Next
For i = 1 To n
t(i) = Mid(word2, i, 1)
Next
For i = 0 To m
d(i, 0) = i
Next
For j = 0 To n
d(0, j) = j
Next
For i = 1 To m
For j = 1 To n
If s(i) = t(j) Then
cost = 0
Else
cost = 1
End If
a(0) = d(i - 1, j) + 1 '' deletion
a(1) = d(i, j - 1) + 1 '' insertion
a(2) = d(i - 1, j - 1) + cost '' substitution
r = a(0)
For k = 1 To UBound(a)
If a(k) < r Then r = a(k)
Next
d(i, j) = r
Next
Next
LevenshteinDistance = d(m, n)
End Function
这篇关于在VBA中查找相似的发音文本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!