How can I select all posts which have specific tags?
Question
Here is my table structure:
// posts
+----+-----------+---------------------+-------------+
| id | title | body | keywords |
+----+-----------+---------------------+-------------+
| 1 | title1 | Something here | php,oop |
| 2 | title2 | Something else | html,css,js |
+----+-----------+---------------------+-------------+
// tags
+----+----------+
| id | name |
+----+----------+
| 1 | php |
| 2 | oop |
| 3 | html |
| 4 | css |
| 5 | js |
+----+----------+
// pivot
+---------+--------+
| post_id | tag_id |
+---------+--------+
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 2 | 4 |
| 2 | 5 |
+---------+--------+
I have two tags (php and html) and I need to select all posts tagged with them. How can I do that?
Currently I use REGEXP on the keywords column and simply select what I want like this:
SELECT * FROM posts WHERE keywords REGEXP 'php|html';
As you can see, I don't use a single join. My dataset has grown recently and the query now takes a while to execute. I guess I should use a relational feature such as a join, but I'm not sure it would be faster than my current query.
Does anybody know how I can get the expected result faster?
Recommended answer
Regular expressions can be slow to process. Using LIKE will probably give better response times:
SELECT *
FROM posts
WHERE (keywords LIKE '%php%' OR keywords LIKE '%html%')
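One caveat with the LIKE pattern above is that '%php%' matches any keyword containing that substring, so a hypothetical keyword such as phpunit would also match. If the database is MySQL (an assumption; the question only shows REGEXP, which MySQL supports), FIND_IN_SET matches whole items of a comma-separated list. A minimal sketch:

```sql
-- Assumes MySQL and a keywords column holding comma-separated values without spaces.
-- FIND_IN_SET returns the 1-based position of the item, or 0 if it is absent.
SELECT *
FROM posts
WHERE FIND_IN_SET('php', keywords) > 0
   OR FIND_IN_SET('html', keywords) > 0;
```

Like the REGEXP and LIKE versions, this still has to examine the keywords value of every row, so it addresses correctness rather than speed.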
The query based on the normalised tables would be:
SELECT posts.id, posts.title, posts.body, posts.keywords
FROM posts
INNER JOIN pivot ON pivot.post_id = posts.id
INNER JOIN tags ON tags.id = pivot.tag_id
WHERE tags.name IN ('html', 'php')
GROUP BY posts.id
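The query above returns posts that carry at least one of the listed tags, which mirrors the REGEXP 'php|html' behaviour. If the requirement is instead that a post carries every listed tag, one common variant is to count the matched tags per post. This is a sketch of that alternative reading, not something the original answer states:

```sql
-- Variant: keep only posts that have ALL of the listed tags.
-- The literal 2 must equal the number of names in the IN list.
SELECT posts.id, posts.title, posts.body, posts.keywords
FROM posts
INNER JOIN pivot ON pivot.post_id = posts.id
INNER JOIN tags ON tags.id = pivot.tag_id
WHERE tags.name IN ('html', 'php')
GROUP BY posts.id
HAVING COUNT(DISTINCT tags.id) = 2;
```

With the sample data from the question this returns no rows, because no post is tagged with both php and html.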
For optimal speed, make sure that the id fields are declared as primary keys and that you have indexes on:
tags(name)
pivot(tag_id)
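The two indexes listed above could be created roughly as follows. The index names idx_tags_name and idx_pivot_tag_id are placeholders chosen for illustration, not names from the original schema:

```sql
-- Lets the lookup by tag name avoid scanning the tags table.
CREATE INDEX idx_tags_name ON tags (name);

-- Lets the join find the pivot rows for a given tag directly.
CREATE INDEX idx_pivot_tag_id ON pivot (tag_id);
```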
Still, this will not be faster than your current solution if a significant share of the posts fulfil the condition; it could well be slower. But if, for example, fewer than 1% of the posts satisfy the condition, then this will likely perform better, because in principle the execution plan does not need to scan the whole posts table.
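Whether the join version actually avoids a full scan on a given dataset can be checked by asking the database for its execution plan. A minimal sketch, assuming a database that supports the EXPLAIN prefix (MySQL and PostgreSQL both do; the output format differs between systems):

```sql
-- Shows the planned access path for each table instead of running the query.
EXPLAIN
SELECT posts.id, posts.title, posts.body, posts.keywords
FROM posts
INNER JOIN pivot ON pivot.post_id = posts.id
INNER JOIN tags ON tags.id = pivot.tag_id
WHERE tags.name IN ('html', 'php')
GROUP BY posts.id;
```

If the plan reports index lookups on tags and pivot rather than full table scans, the indexes are being used as intended.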