在MySQL表中获取最相似的行并按相似性对其进行排序 [英] Getting most similar rows in MySQL table and order them by similarity
问题描述
我有一个数据库表,其中包含用户的车辆(汽车,摩托车).我想从那张桌子中拿出最相似的车辆.可以说该表包含以下几列(并带有一些上下文可以理解这一点):
I have a database table that holds user's vehicles (cars, motorcycles). I want to get the most similar vehicles out of that table. Lets say the table holds the following columns (with some context to get the idea):
table: vehicles
vehicle_id (pk, auto-increment)
model_id (BMW 3er, Honda Accord)
fuel_type (gasoline, diesel)
body_style (sedan, coupe)
year
engine_size (2.0L)
engine_power (150hp)
因此,简而言之,我想选择N个(通常是3个)具有相同make_id(至少)的行,并按它们与种子载体共享的相似度对它们进行排名,可以说如果fuel_type
匹配,我就是d的得分为+3,但如果body_style
匹配,则为+1.理想情况下,我将获得N点具有最高分数的车辆,但想法是当我没有时仍能得到一些东西.
So in short I want to select N (usually 3) rows that have the same make_id (at least) and rank them by the amount of similarities they share with the seed vehicle lets say if the fuel_type
matches, I'd have rank points +3, but if the body_style
matches, it would be +1. Ideally I would get N vehicles that have maximum points but the idea is to still get something when I don't.
推荐答案
由于在我的表中目前只有大约5k行,并且它们正在缓慢增长,所以我决定实际使用以下简单方法(在我遇到之后我写了问题).
As in my table currently I have only around 5k rows and they are slowly growing, I decided to actually use the following simple approach (it came to me just after I wrote the question).
种子可以说是本田雅阁(model_id 456),2004,汽油,2.0L,155hp,轿车,带自动档ID 123.
The seed lets say is Honda Accord (model_id 456), 2004, gasoline, 2.0L, 155hp, sedan with auto-inc ID 123.
SELECT vehicles.*,
(IF(`fuel_type`='gasoline', 3, 0) +
IF(`body_style`='sedan', 1, 0) +
IF(`year` > 2001 AND `year` < 2007, 2, 0) +
IF(`engine_size` >= 1.8 AND `engine_size` <= 2.2, 1, 0) +
IF(`engine_power`=155, 3, IF(`engine_power`>124 AND `engine_power`<186, 1, 0))) AS `rank`
FROM vehicles
WHERE vehicle_id!=123 AND model_id=456
ORDER BY `rank` DESC
LIMIT 3
只要我没有太多的行,它将起作用.如果表格变成50-100k,我可能不得不切换到Lucene
?
It will work, as long as I don't too many rows. If the table becomes 50-100k, I probably will have to switch to something like Lucene
?
这篇关于在MySQL表中获取最相似的行并按相似性对其进行排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!