SQL-选择最相似的产品 [英] SQL- Selecting the most similar product

查看:95
本文介绍了SQL-选择最相似的产品的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

好的,我有一个关系,该关系存储两个键,一个产品ID和一个属性ID.我想弄清楚哪种产品与给定的产品最相似. (属性实际上是数字,但这会使示例更加混乱,因此已将其更改为字母以简化视觉表示.)

Alright, I have a relation which stores two keys, a product Id and an attribute Id. I want to figure out which product is most similar to a given product. (Attributes are actually numbers but it makes the example more confusing so they have been changed to letters to simplify the visual representation.)

Prod_att

Product | Attributes  
   1   |    A     
   1   |    B  
   1   |    C  
   2   |    A  
   2   |    B  
   2   |    D  
   3   |    A  
   3   |    E  
   4   |    A  

最初,这似乎很简单,只需选择产品具有的属性,然后计算每个产品共享的属性数.然后将其结果与一个产品具有的属性数量进行比较,我可以看到两个产品的相似程度.这适用于相对于其比较产品具有大量属性的产品,但是当产品具有很少的属性时会出现问题.例如,产品3几乎与其他所有产品都息息相关(因为A很常见).

Initially this seems fairly simple, just select the attributes that a product has and then count the number of attributes per product that are shared. The result of this is then compared to the number of attributes a product has and I can see how similar two products are. This works for products with a large number of attributes relative to their compared products, but issues arise when products have very few attributes. For example product 3 will have a tie for almost every other product (as A is very common).

SELECT Product, count(Attributes)  
FROM Prod_att  
WHERE Attributes IN  
(SELECT Attributes  
FROM prod_att  
WHERE Product = 1)  
GROUP BY Product
;  

关于如何解决此问题或对当前查询进行改进的任何建议?
谢谢!

Any suggestions on how to fix this or improvements to my current query?
Thanks!

* edit:产品4将为所有产品返回count()= 1.我想展示产品3更相似,因为它具有更少的不同属性.

*edit: Product 4 will return count() =1 for all Products. I would like to show Product 3 is more similar as it has fewer differing attributes.

推荐答案

尝试一下

SELECT 
  a_product_id, 
  COALESCE( b_product_id, 'no_matchs_found' ) AS closest_product_match
FROM (
  SELECT 
    *,  
    @row_num := IF(@prev_value=A_product_id,@row_num+1,1) AS row_num,
    @prev_value := a_product_id
  FROM 
    (SELECT @prev_value := 0) r
    JOIN (
        SELECT 
         a.product_id as a_product_id,
         b.product_id as b_product_id,
         count( distinct b.Attributes ),
         count( distinct b2.Attributes ) as total_products
        FROM
          products a
          LEFT JOIN products b ON ( a.Attributes = b.Attributes AND a.product_id <> b.product_id )
          LEFT JOIN products b2 ON ( b2.product_id = b.product_id )
       /*WHERE */
         /*  a.product_id = 3 */
        GROUP BY
         a.product_id,
         b.product_id
        ORDER BY 
          1, 3 desc, 4
  ) t
) t2 
WHERE 
  row_num = 1

上面的query获取所有产品的closest matches,您可以在最里面的查询中包含product_id,要获取特定product_id的结果,我使用了LEFT JOIN,因此即使如果product没有匹配项,则显示它

The above query gets the closest matches for all the products, you can include the product_id in the innermost query, to get the results for a particular product_id, I have used LEFT JOIN so that even if a product has no matches, its displayed

SQLFIDDLE

希望这会有所帮助

这篇关于SQL-选择最相似的产品的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆