Transliteration and fuzzy search, like Google suggestions
Problem description
I need to do a fuzzy search that tolerates transliterated (phonetic) spellings. For example:
I have an ASP.NET application with a database containing a table of Spanish words (200,000 entries), and a page with an input field. I do not know Spanish and do not know how to spell the word I am searching for, but I know how it sounds. So in the text box I type the word for "beautiful" the way I hear it, misspelled as "prekieso", and I need to get the correct spelling, "precioso", back from the database.
How can this be implemented? In other words, I need something similar to Google suggestions...
Recommended answer
A user-defined function that calculates a Levenshtein-style edit distance between two strings:
USE [dbname] -- replace dbname with the name of your database
GO
/****** Object: UserDefinedFunction [dbo].[levenshtein] Script Date: 05/27/2013 17:54:05 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- Use ALTER FUNCTION instead if dbo.levenshtein already exists.
CREATE FUNCTION [dbo].[levenshtein](@left varchar(100), @right varchar(100))
RETURNS int
AS
BEGIN
    DECLARE @difference int, @lenRight int, @lenLeft int,
            @leftIndex int, @rightIndex int,
            @left_char char(1), @right_char char(1), @compareLength int

    SET @lenLeft = LEN(@left)
    SET @lenRight = LEN(@right)
    SET @difference = 0

    -- If either string is empty, the distance is the length of the other.
    IF @lenLeft = 0
    BEGIN
        SET @difference = @lenRight
        GOTO done
    END
    IF @lenRight = 0
    BEGIN
        SET @difference = @lenLeft
        GOTO done
    END

    -- Walk both strings up to the length of the longer one.
    IF (@lenLeft >= @lenRight)
        SET @compareLength = @lenLeft
    ELSE
        SET @compareLength = @lenRight

    SET @rightIndex = 1
    SET @leftIndex = 1
    WHILE @leftIndex <= @compareLength
    BEGIN
        SET @left_char = SUBSTRING(@left, @leftIndex, 1)
        SET @right_char = SUBSTRING(@right, @rightIndex, 1)
        IF @left_char <> @right_char
        BEGIN
            -- Would an insertion make them re-align?
            IF (@left_char = SUBSTRING(@right, @rightIndex + 1, 1))
                SET @rightIndex = @rightIndex + 1
            -- Would a deletion make them re-align?
            ELSE IF (SUBSTRING(@left, @leftIndex + 1, 1) = @right_char)
                SET @leftIndex = @leftIndex + 1
            -- Every mismatch counts as one edit.
            SET @difference = @difference + 1
        END
        SET @leftIndex = @leftIndex + 1
        SET @rightIndex = @rightIndex + 1
    END

done:
    RETURN @difference
END
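The function above walks both strings once and realigns greedily, so it may not always return the true minimum number of edits. For comparison, here is a minimal Python sketch of the classic dynamic-programming Levenshtein distance. It is not part of the original answer; the function name and the sample words are chosen only for illustration.

```python
def levenshtein(left: str, right: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1."""
    if not left:
        return len(right)
    if not right:
        return len(left)

    # previous[j] = distance between the processed prefix of `left`
    # and the first j characters of `right`.
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]  # distance from left[:i] to the empty string
        for j, right_char in enumerate(right, start=1):
            substitution_cost = 0 if left_char == right_char else 1
            current.append(min(
                previous[j] + 1,                      # delete from left
                current[j - 1] + 1,                   # insert into left
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("prekieso", "precioso"))  # 2 (k -> c, e -> o)
print(levenshtein("fuzzy", "fuzy"))         # 1 (one deleted z)
```

The full table costs time proportional to the product of the two lengths, which is one reason a cheaper single-pass approximation can be attractive when scanning 200,000 rows.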
Calling it:
SELECT
    dbo.levenshtein('Fuzzy String Match', 'fuzzy string match'),
    dbo.levenshtein('fuzzy', 'fuzy'),
    dbo.levenshtein('Fuzzy String Match', 'fuzy string match'),
    dbo.levenshtein('levenshtein distance sql', 'levenshtein sql server'),
    dbo.levenshtein('distance', 'server')
Or:
SELECT [Name]
FROM [tempdb].[dbo].[Names]
WHERE dbo.levenshtein([Name], 'bozhestvennia') <= 3
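The threshold query above returns every row within a fixed distance, in no particular order. A suggestion box usually wants the few best matches first instead. The sketch below shows that ranking idea in Python with the standard library's `difflib.get_close_matches`. Note that it scores by `difflib`'s similarity ratio, not by Levenshtein distance, and that `normalize`, `suggest` and the four-word list are hypothetical names and data invented for this example.

```python
import difflib
import unicodedata


def normalize(word: str) -> str:
    """Lowercase and strip combining accents (this also folds ñ to n)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def suggest(query: str, words: list, limit: int = 3) -> list:
    """Return up to `limit` candidate words, best match first."""
    # Map normalized key -> original spelling. If two words share a key,
    # the later one wins; a real implementation would keep both.
    by_key = {normalize(word): word for word in words}
    keys = difflib.get_close_matches(
        normalize(query), list(by_key), n=limit, cutoff=0.6
    )
    return [by_key[key] for key in keys]


# Made-up sample standing in for the 200,000-row table.
WORDS = ["precioso", "precio", "perro", "hermoso"]
print(suggest("prekieso", WORDS))
```

In this sample, "precioso" is ranked first for the input "prekieso". The `cutoff` of 0.6 is `difflib`'s default and would need tuning against real data.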