SQL模式匹配 [英] SQL pattern matching

查看:101
本文介绍了SQL模式匹配的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个与SQL有关的问题.

I have a question related to SQL.

我想匹配两个字段以求相似度,并返回相似度的百分比.

I want to match two fields for similarities and return a percentage on how similar it is.

例如,如果我有一个名为doc的字段,其中包含以下内容

For example if I have a field called doc, which contains the following

This is my first assignment in SQL 

在另一个字段中,我有类似的东西

and in another field I have something like

My first assignment in SQL 

我想知道如何检查两者之间的相似性并返回多少百分比.

I want to know how I can check the similarities between the two and return by how much percent.

我做了一些研究,想要第二意见,而且我从未要求提供源代码. Ive使用Levenshtein距离算法研究了Soundex(),difference(),模糊字符串匹配.

I did some research and wanted a second opinion plus I never asked for source code. Ive looked at Soundex(), Difference(), Fuzzy string matching using Levenshtein distance algorithm.

推荐答案

您没有说您使用的是哪个版本的Oracle.本示例基于11g版本. 您可以使用 utl_match edit_distance函数.包来确定您需要更改多少个字符才能将一个字符串转换为另一个字符串. greatst 函数返回列表中的最大值传入的参数.这是一个示例:

You didn't say what version of Oracle you are using. This example is based on 11g version. You can use edit_distance function of utl_match package to determine how many characters you need to change in order to turn one string to another. greatest function returns the greatest value in the list of passed in parameters. Here is an example:

-- sample of data 
with t1(col1, col2) as(
  select 'This is my first assignment in SQL', 'My first assignment in SQL ' from dual
)
-- the query
select trunc(((greatest(length(col1), length(col2)) -  
              (utl_match.edit_distance(col2, col1))) * 100) / 
             greatest(length(col1), length(col2)), 2) as "%"
  from t1

结果:

         %
----------
     70.58

附录

正如@jonearles正确指出的那样,使用utl_match包的edit_distance_similarity功能要简单得多.

As @jonearles correctly pointed out, it is much simpler to use edit_distance_similarity function of utl_match package.

 with t1(col1, col2) as(
     select 'This is my first assignment in SQL', 'My first assignment in SQL ' from dual
  )
  select utl_match.edit_distance_similarity(col1, col2) as "%"
    from t1
   ;

结果:

         %
----------
        71

这篇关于SQL模式匹配的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆