使用levenshtein距离比率比较2条记录 [英] using levenshtein distance ratio to compare 2 records
问题描述
我已经使用levenshtein距离和比率源代码创建了mysql用户函数.我正在比较2条记录,并基于75%的匹配度来选择记录.
I've created the mysql user function using the levenshtein distance and ratio source codes. I am comparing 2 records and based on a 75% match I want to select the record.
- 订单进入带有
ITEM
标题 的表 - 对表
itemkey
执行查询,以在名为ITEM
的记录中找到75%的匹配项 - 如果匹配了75%的标题,则会将表
itemkey
中的八位数字分配给表paypal_ipn_orders
paypal_ipn_orders
- Order comes into table
paypal_ipn_orders
with anITEM
title - A query executes against a table
itemkey
to find a 75% match in a record calledITEM
as well - if a 75% title is match it assigns an eight digit number from table
itemkey
to tablepaypal_ipn_orders
这是查询
UPDATE paypal_ipn_orders
SET sort_num = (SELECT sort_id
FROM itemkey
WHERE levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75)
WHERE packing_slip_printed = 0
AND LOWER(payment_status) = 'completed'
AND address_name <> ''
AND shipping < 100
我对此进行了几次调整,但是在levenshtein_ratio部分的第4行和第5行之间失败了.如果成功,则说明子查询返回多个行.我不知道如何解决它,使之返回正确的结果,我只是失去了对如何使这项工作.
I have adjusted this a few times but it's failing between line 4 and 5 at the levenshtein_ratio part. If it works it says that the subquery returns more than one row. I don't know how to fix it to make it return the correct result, I just lost as to how to make this work.
推荐答案
SET上的子查询只能返回一个值.如果itemkey有一个以上项占item_name的75%,您要怎么做?以下将使用最佳匹配之一:
A subquery on a SET should only return one value. If itemkey has more than one item that is 75% of item_name what do you want to do? The below will use one of the best matches:
UPDATE paypal_ipn_orders
SET sort_num = (SELECT sort_id
FROM itemkey
WHERE levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75
ORDER BY levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) DESC
LIMIT 1)
WHERE packing_slip_printed = 0
AND LOWER(payment_status) = 'completed'
AND address_name <> ''
AND shipping < 100
这篇关于使用levenshtein距离比率比较2条记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!