Compare sets of properties to find the best match


Question

There seem to be similar questions, but none quite the same. I tried going down this path (compare data sets and return best match), but found myself stumped.

I need to take one set and find the best matching set. Say we have a search_obj that contains the values (1, 4, 29, 44, 378, 379). I would like to find other objects with similar values and, ideally, the one that matches best. There will be a large number of other objects, so performance is a big concern.

I am currently working in PHP and MySQL, but I am willing to change that if it means better performance.

Thanks for your help.

Recommended answer

Here is what comes to mind:

Suppose you have a table of unique pairs (a, b), where a identifies an item and b is one of its properties:

CREATE TABLE t1 (a INT, b INT, PRIMARY KEY (a, b));

Now populate it with:

INSERT INTO t1
VALUES (1,1), (1,2),               -- item 1: the item to compare with
       (2,1), (2,3),               -- shares one prop with item 1, plus one extra
       (3,1), (3,2),               -- has exactly the same props as item 1
       (4,1), (4,2), (4,3), (4,4); -- shares 2 props with item 1, plus 2 extras

The following query orders the other items by their similarity to item 1:

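SELECT t1.a,
    COUNT(t2.a) AS same_props_count,            -- props shared with item 1
    ABS(COUNT(t2.a) - COUNT(*)) AS diff_count   -- props the item has that item 1 lacks
FROM t1
-- t2 is restricted to item 1's rows, so t2.a is NULL wherever t1.b is not a prop of item 1
LEFT JOIN t1 AS t2 ON t1.b = t2.b AND t2.a = 1
WHERE t1.a <> 1
GROUP BY t1.a
ORDER BY same_props_count DESC, diff_count;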


a, same_props_count, diff_count
3, 2,                0
4, 2,                2
2, 1,                1
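
To try the ranking without a MySQL server, here is a minimal sketch that runs the same table and query in an in-memory SQLite database from Python. The use of SQLite and Python, the function names rank_similar and build_sample_db, and the :target parameter (replacing the hard-coded 1) are assumptions made for illustration and are not part of the original answer.

```python
import sqlite3

# The query from the answer, with the reference item id turned into a
# named parameter (:target) instead of the hard-coded 1.
RANK_QUERY = """
SELECT t1.a,
       COUNT(t2.a) AS same_props_count,
       ABS(COUNT(t2.a) - COUNT(*)) AS diff_count
FROM t1
LEFT JOIN t1 AS t2 ON t1.b = t2.b AND t2.a = :target
WHERE t1.a <> :target
GROUP BY t1.a
ORDER BY same_props_count DESC, diff_count
"""


def build_sample_db():
    """Create an in-memory database holding the sample rows from the answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (a INT, b INT, PRIMARY KEY (a, b))")
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?)",
        [
            (1, 1), (1, 2),                  # item to compare with
            (2, 1), (2, 3),                  # one common prop
            (3, 1), (3, 2),                  # same props as item 1
            (4, 1), (4, 2), (4, 3), (4, 4),  # two common props, two extras
        ],
    )
    return conn


def rank_similar(conn, target):
    """Return (item, same_props_count, diff_count) rows, best match first."""
    return conn.execute(RANK_QUERY, {"target": target}).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for row in rank_similar(conn, 1):
        print(row)
```

Note that diff_count only counts properties a candidate has that the reference item lacks; properties of the reference item that are missing from the candidate show up only as a lower same_props_count.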
