MySQL Query to find most similar numerical row

Problem description

In a MySQL database, I am attempting to find the most similar row across a number of numerical attributes. This problem is similar to this question but includes a flexible number of comparisons and a join table.

The database consists of two tables. The first table, users, is what I'm trying to compare.

user_id | self_ranking
----------------------------------
1       | 9
2       | 3
3       | 2

The second table is a series of scores which the user gave to particular items.

id | user_id | item_id | score
----------------------------------
1  | 1       | 1       | 4
2  | 1       | 2       | 5
3  | 1       | 3       | 8
4  | 1       | 4       | 3
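
To experiment with these two tables, a minimal setup might look like the following. This is only a sketch: the column types, keys, and indexes are assumptions, the table name `scores` is taken from the answer's query, and the key of `users` is assumed to be named `user_id`, as that query expects.

```sql
-- Assumed schema: only the column names and sample rows come from the question.
CREATE TABLE users (
    user_id      INT PRIMARY KEY,
    self_ranking INT NOT NULL
);

CREATE TABLE scores (
    id      INT PRIMARY KEY,
    user_id INT NOT NULL,
    item_id INT NOT NULL,
    score   INT NOT NULL,
    UNIQUE KEY uq_user_item (user_id, item_id),  -- one score per user per item (assumed)
    KEY idx_item (item_id),                      -- supports joining scores on item_id
    FOREIGN KEY (user_id) REFERENCES users (user_id)
);

-- Sample rows exactly as shown in the question.
INSERT INTO users (user_id, self_ranking) VALUES (1, 9), (2, 3), (3, 2);

INSERT INTO scores (id, user_id, item_id, score) VALUES
    (1, 1, 1, 4),
    (2, 1, 2, 5),
    (3, 1, 3, 8),
    (4, 1, 4, 3);
```

The index on `item_id` is there because comparing users means matching score rows item by item, which addresses the "join efficiently" concern at least in part.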

Task

I want to find the "most similar" user to a given one, valuing all the ranked items equally (along with the self score). Thus, a perfect match would be a user who has ranked all the same items in exactly the same manner and has rated himself the same, while the next-best match would be one whose ranking of a single item differs slightly.

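I'm having difficulty with:

  • Joining the two tables efficiently
  • Handling the fact that not all users have ranked the same items. We only want to compare rankings of the same items.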

Could someone help me construct a reasonable query? I'm not terribly strong with MySQL, so sorry if this answer should be obvious.

If user 4 has ranked himself 8 and items 1=>4 and 2=>5, then I'd like to have the query for user 4's closest user to return 1, the user_id of the closest user.
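
The expected result can be checked by hand. The sketch below assumes tables named `users` and `scores` with the columns shown earlier, a `users` key called `user_id`, and made-up `id` values 5 and 6 for the new score rows.

```sql
-- User 4 from the example: self_ranking 8, item 1 => 4, item 2 => 5.
-- The scores.id values 5 and 6 are hypothetical.
INSERT INTO users (user_id, self_ranking) VALUES (4, 8);
INSERT INTO scores (id, user_id, item_id, score) VALUES
    (5, 4, 1, 4),
    (6, 4, 2, 5);

-- Distance from user 4 to user 1 over the shared items 1 and 2,
-- plus the difference in self_ranking:
--   |4 - 4| + |5 - 5| + |9 - 8| = 1
SELECT ABS(4 - 4) + ABS(5 - 5) + ABS(9 - 8) AS distance_to_user_1;
```

Users 2 and 3 have no rows in the sample scores table, so they share no items with user 4 and user 1 is the only candidate.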

Recommended answer

SELECT   u2.user_id

-- join our user to their scores
FROM     (users u1 JOIN scores s1 USING (user_id))

-- and then join other users and their scores
    JOIN (users u2 JOIN scores s2 USING (user_id))
      ON s1.item_id  = s2.item_id
     AND u1.user_id != u2.user_id

-- filter for our user of interest
WHERE    u1.user_id = ?

-- group other users' scores together
GROUP BY u2.user_id

-- and here's the magic: order in ascending order of "distance" between
-- our selected user and all of the others, so the closest user comes
-- first: you may wish to weight self_ranking differently to item scores,
-- in which case just multiply appropriately
ORDER BY SUM(ABS(s2.score - s1.score))
       + ABS(u2.self_ranking - u1.self_ranking) ASC

-- keep only the single most similar user
LIMIT    1
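
One limitation of a summed distance is that a user who shares only one item with the target accumulates less distance than a user who shares many. A possible variation, which is not part of the answer above, averages the distance over the number of compared values and breaks ties in favour of larger overlap. The aliases `shared_items` and `avg_distance` are made up for illustration, and the `users` key is again assumed to be `user_id`.

```sql
SELECT   u2.user_id,
         COUNT(*) AS shared_items,
         -- MAX() is a no-op here (self_ranking is constant within each group);
         -- it only keeps the expression valid under ONLY_FULL_GROUP_BY.
         (SUM(ABS(s2.score - s1.score))
            + ABS(MAX(u2.self_ranking) - MAX(u1.self_ranking)))
           / (COUNT(*) + 1) AS avg_distance   -- shared items plus the self score
FROM     users u1
    JOIN scores s1 ON s1.user_id = u1.user_id
    JOIN scores s2 ON s2.item_id = s1.item_id      -- compare the same items only
                  AND s2.user_id <> s1.user_id     -- skip the user of interest
    JOIN users  u2 ON u2.user_id = s2.user_id
WHERE    u1.user_id = ?
GROUP BY u2.user_id
ORDER BY avg_distance ASC, shared_items DESC
LIMIT    1;
```

Because every join is an inner join, users who share no items with the target never appear in the result. Whether that is desirable depends on how much a match on `self_ranking` alone should count.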
