Compare text differences between two almost identical rows / tables in MySQL


Problem description

I have two tables with an identical structure but different data. I would like to compare the words in a single field of table A against the rows of the same column in table B, and get the percentage of matching words for each id.

Example:

The following are the entries in table A:

Row1: 1, salt water masala
Row2: 2, water onion maggi milk

The following are the entries in table B:

Row1: 1, salt masala water 
Row2: 2, water onion maggi

Desired result:

Row1: Match 100% (all 3 words are present, only in a different order)
Row2: Match 75% (1 of the 4 words does not match)

It would be really great if someone could help me with this.

Recommended answer

Although it would be easier to accomplish this in your application code, it is possible with a couple of MySQL stored functions:

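delimiter //

drop function if exists string_splitter //

-- Returns the pos-th token (1-based) of str when split on delim.
-- The inner substring_index keeps everything up to the pos-th token;
-- the outer one keeps only the last token of that prefix.
create function string_splitter(
  str text,
  delim varchar(25),
  pos int) returns text
deterministic
begin
  return substring_index(substring_index(str, delim, pos), delim, -1);
end //

drop function if exists percentage_of_matches //

-- Returns the percentage of words in str1 that also occur as whole words in str2.
-- Declared deterministic because the result depends only on the arguments;
-- servers with binary logging enabled may otherwise reject the definition.
create function percentage_of_matches(
  str1 text,
  str2 text) returns double
deterministic
begin
  -- Local variables instead of @session variables, so nothing leaks out of the call.
  declare i int default 1;
  declare num_words int;
  declare num_matches int default 0;
  declare word text;

  -- Normalize whitespace: trim the ends and collapse runs of spaces to one space.
  set str1 = trim(str1);
  set str2 = trim(str2);
  while instr(str1, '  ') > 0 do
    set str1 = replace(str1, '  ', ' ');
  end while;
  while instr(str2, '  ') > 0 do
    set str2 = replace(str2, '  ', ' ');
  end while;

  -- Word count = number of single spaces + 1.
  set num_words = 1 + char_length(str1) - char_length(replace(str1, ' ', ''));

  while i <= num_words do
    set word = string_splitter(str1, ' ', i);
    -- Pad both sides with spaces so only whole words match. instr is used
    -- rather than like, so a % or _ inside a word is not treated as a wildcard.
    if instr(concat(' ', str2, ' '), concat(' ', word, ' ')) > 0 then
      set num_matches = num_matches + 1;
    end if;
    set i = i + 1;
  end while;

  return (num_matches / num_words) * 100;
end //

delimiter ;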

The first function is used by the second, which is the one you want to call in your code, like so:

select percentage_of_matches('salt water masala', 'salt masala water');
select percentage_of_matches('water onion maggi milk', 'water onion maggi');
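
The question asks for a percentage per id across the two tables rather than for literal strings, so the call can be combined with a join. The following is a minimal sketch, not a definitive implementation: the table names `table_a` and `table_b` and the column names `id` and `words` are assumptions made for illustration (the question does not give the real schema), and temporary tables are used so that nothing existing is touched.

```sql
-- Hypothetical schema: table and column names are assumed, not taken from the question.
create temporary table table_a (id int primary key, words text);
create temporary table table_b (id int primary key, words text);

-- Sample rows from the question (table B row 1 has a trailing space, as posted).
insert into table_a values (1, 'salt water masala'), (2, 'water onion maggi milk');
insert into table_b values (1, 'salt masala water '), (2, 'water onion maggi');

-- One row per id: percentage of table A's words that also occur in table B's row.
select a.id,
       percentage_of_matches(a.words, b.words) as match_percent
from table_a a
join table_b b on b.id = a.id
order by a.id;
```

With the sample data this should return 100 for id 1 and 75 for id 2. Note that the percentage is relative to the number of words in the first argument, so swapping the arguments for id 2 would give 100, because all three words of table B's row occur in table A's row.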
