SQL-选择最相似的产品 [英] SQL- Selecting the most similar product
问题描述
好的,我有一个关系,该关系存储两个键,一个产品ID和一个属性ID.我想弄清楚哪种产品与给定的产品最相似. (属性实际上是数字,但这会使示例更加混乱,因此已将其更改为字母以简化视觉表示.)
Alright, I have a relation which stores two keys, a product Id and an attribute Id. I want to figure out which product is most similar to a given product. (Attributes are actually numbers but it makes the example more confusing so they have been changed to letters to simplify the visual representation.)
Prod_att
Product | Attributes
1 | A
1 | B
1 | C
2 | A
2 | B
2 | D
3 | A
3 | E
4 | A
最初,这似乎很简单,只需选择产品具有的属性,然后计算每个产品共享的属性数.然后将其结果与一个产品具有的属性数量进行比较,我可以看到两个产品的相似程度.这适用于相对于其比较产品具有大量属性的产品,但是当产品具有很少的属性时会出现问题.例如,产品3几乎与其他所有产品都息息相关(因为A很常见).
Initially this seems fairly simple, just select the attributes that a product has and then count the number of attributes per product that are shared. The result of this is then compared to the number of attributes a product has and I can see how similar two products are. This works for products with a large number of attributes relative to their compared products, but issues arise when products have very few attributes. For example product 3 will have a tie for almost every other product (as A is very common).
SELECT Product, count(Attributes)
FROM Prod_att
WHERE Attributes IN
(SELECT Attributes
FROM prod_att
WHERE Product = 1)
GROUP BY Product
;
关于如何解决此问题或对当前查询进行改进的任何建议?
谢谢!
Any suggestions on how to fix this or improvements to my current query?
Thanks!
* edit:产品4将为所有产品返回count()= 1.我想展示产品3更相似,因为它具有更少的不同属性.
*edit: Product 4 will return count() =1 for all Products. I would like to show Product 3 is more similar as it has fewer differing attributes.
推荐答案
尝试一下
SELECT
a_product_id,
COALESCE( b_product_id, 'no_matchs_found' ) AS closest_product_match
FROM (
SELECT
*,
@row_num := IF(@prev_value=A_product_id,@row_num+1,1) AS row_num,
@prev_value := a_product_id
FROM
(SELECT @prev_value := 0) r
JOIN (
SELECT
a.product_id as a_product_id,
b.product_id as b_product_id,
count( distinct b.Attributes ),
count( distinct b2.Attributes ) as total_products
FROM
products a
LEFT JOIN products b ON ( a.Attributes = b.Attributes AND a.product_id <> b.product_id )
LEFT JOIN products b2 ON ( b2.product_id = b.product_id )
/*WHERE */
/* a.product_id = 3 */
GROUP BY
a.product_id,
b.product_id
ORDER BY
1, 3 desc, 4
) t
) t2
WHERE
row_num = 1
上面的query
获取所有产品的closest matches
,您可以在最里面的查询中包含product_id
,要获取特定product_id
的结果,我使用了LEFT JOIN
,因此即使如果product
没有匹配项,则显示它
The above query
gets the closest matches
for all the products, you can include the product_id
in the innermost query, to get the results for a particular product_id
, I have used LEFT JOIN
so that even if a product
has no matches, its displayed
希望这会有所帮助
这篇关于SQL-选择最相似的产品的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!