Will this (normalised) database structure permit me to search by tags as I intend?

Question

I am trying to set up a normalised MySQL database containing the three following tables. The first table contains a list of items which can be described by various tags. The third table contains the various tags used to describe the items in the first table. The middle table relates the other two tables to each other. In each table's case, the id is an auto-incrementing primary key (and each is used as the foreign key in the middle table)

+---------------+---------------------+---------------+
|   Table 1     |      Table 2        |   Table 3     |
+---------------+---------------------+---------------+
|id        item |id   item_id   tag_id|id          tag|
+---------------+---------------------+---------------+
| 1      spaniel| 1         1        4| 1         bird|
| 2        tabby| 2         1       23| 4          pet|
| 3      chicken| 3         1       41|23          dog|
| 4     goldfish| 4         2        4|24          cat|
|               | 5         2       24|25      reptile|
|               | 6         3        1|38         fish|
|               | 7         3       40|40    delicious|
|               | 8         4        4|41        cheap|
|               | 9         4       38|42    expensive|
|               |10         4       41|               |
|               |                     |               |
+---------------+---------------------+---------------+

I want to run a query of one or more tags against the three tables to return the items that match ALL of the tags.

So for example, querying for "pet" would return the items (1)spaniel, (2)tabby and (4)goldfish, because all of them are tagged "pet". Querying for "cheap" and "pet" together would return (1)spaniel and (4)goldfish because they are both tagged "cheap" and "pet". Tabby would not be returned as it is only tagged "pet" but not "cheap" (in my world tabby cats are expensive :P)

Querying for "cheap", "pet" and "dog" would only return (1)Spaniel, since it is the only one matching all three tags.

Anyway, this is the desired behaviour. I have two questions.



  1. Is this the best way to set up my tables for my intended purposes? I am still new to ideas of normalising databases, and am picking this up as I go along - any input on efficiency or even if this is an appropriate layout for my database would be much appreciated.

  2. Provided the above setup is workable, how could I structure a single MySQL query to achieve my intended purpose?* (That is, for a series of tags, return ONLY the item(s) that match ALL the specified tags.) I have tried a variety of JOINs/UNIONs, but none of them give me the desired effect (they usually return ALL the items that match ANY of the tags). I've spent some time looking through the MySQL manual online, but I feel like I'm missing something conceptually.


*I say single query since of course I could just run a series of simple WHERE/JOIN queries, one for each tag and then combine/sort the returned items in PHP or something after the fact but it seems a foolish and inefficient way of doing it. I feel like there is a way I should be able to do this with a single MySQL query, given the appropriate setup.

Recommended answer

Your schema is looking fairly good. There’s no need for the ID column in your join table—just create a primary key from the ID columns of the other tables (although see Marjan Venema's comment and Should I use composite primary keys or not? for alternative views on this). The following examples show how you can create the tables, add some data, and perform the queries that you requested.

Create tables, complete with foreign key constraints. In short, foreign key constraints help to ensure database integrity. In this example, they prevent rows being inserted into the join table (item_tag) unless matching rows exist in the item and tag tables:

CREATE  TABLE IF NOT EXISTS `item` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `item` VARCHAR(255) NOT NULL ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS `tag` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `tag` VARCHAR(255) NOT NULL ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS `item_tag` (
  `item_id` INT UNSIGNED NOT NULL ,
  `tag_id` INT UNSIGNED NOT NULL ,
  PRIMARY KEY (`item_id`, `tag_id`) ,
  INDEX `fk_item_tag_item` (`item_id` ASC) ,
  INDEX `fk_item_tag_tag` (`tag_id` ASC) ,
  CONSTRAINT `fk_item_tag_item`
    FOREIGN KEY (`item_id` )
    REFERENCES `item` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT `fk_item_tag_tag`
    FOREIGN KEY (`tag_id` )
    REFERENCES `tag` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

Insert some test data:

INSERT INTO item (item) VALUES
('spaniel'),
('tabby'),
('chicken'),
('goldfish');

INSERT INTO tag (tag) VALUES
('bird'),
('pet'),
('dog'),
('cat'),
('reptile'),
('fish'),
('delicious'),
('cheap'),
('expensive');

INSERT INTO item_tag (item_id, tag_id) VALUES
(1,2),
(1,3),
(1,8),
(2,2),
(2,4),
(3,1),
(3,7),
(4,2),
(4,6),
(4,8);
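
The constraints described above can be checked against this test data. The statements below are a minimal sketch, assuming the data was loaded into empty tables (so the AUTO_INCREMENT ids start at 1) and that the statements are run one at a time, since the first two are expected to fail:

```sql
-- There is no tag with id 99, so fk_item_tag_tag should reject this row
-- (MySQL reports error 1452 for a failed foreign key check).
INSERT INTO item_tag (item_id, tag_id) VALUES (1, 99);

-- The pair (1, 2) already exists, so the composite primary key should
-- reject it as a duplicate (error 1062).
INSERT INTO item_tag (item_id, tag_id) VALUES (1, 2);

-- ON DELETE CASCADE: deleting an item also deletes its item_tag rows.
-- Wrapped in a transaction and rolled back so the test data is unchanged.
START TRANSACTION;
DELETE FROM item WHERE id = 3;
SELECT COUNT(*) AS remaining_links FROM item_tag WHERE item_id = 3;
ROLLBACK;
```

Inside the transaction, the count should be 0, because the two links for chicken are removed together with the item. After the ROLLBACK, both the item and its links are back.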

Select all items and all tags:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id;

+----+----------+-----------+
| id | item     | tag       |
+----+----------+-----------+
|  1 | spaniel  | pet       |
|  1 | spaniel  | dog       |
|  1 | spaniel  | cheap     |
|  2 | tabby    | pet       |
|  2 | tabby    | cat       |
|  3 | chicken  | bird      |
|  3 | chicken  | delicious |
|  4 | goldfish | pet       |
|  4 | goldfish | fish      |
|  4 | goldfish | cheap     |
+----+----------+-----------+

Select items with a specific tag:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag = 'pet';

+----+----------+-----+
| id | item     | tag |
+----+----------+-----+
|  1 | spaniel  | pet |
|  2 | tabby    | pet |
|  4 | goldfish | pet |
+----+----------+-----+

Select items with one or more tags. Note that this will return items that have the tags cheap OR pet:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'pet');

+----+----------+-------+
| id | item     | tag   |
+----+----------+-------+
|  1 | spaniel  | pet   |
|  1 | spaniel  | cheap |
|  2 | tabby    | pet   |
|  4 | goldfish | pet   |
|  4 | goldfish | cheap |
+----+----------+-------+

The above query produces an answer that you might not want, as highlighted by the following query. In this case, there are no items with the house tag, but this query still returns some rows:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'house');

+----+----------+-------+
| id | item     | tag   |
+----+----------+-------+
|  1 | spaniel  | cheap |
|  4 | goldfish | cheap |
+----+----------+-------+

You can fix that by adding GROUP BY and HAVING:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'house')
GROUP BY item.id HAVING COUNT(*) = 2;

Empty set (0.00 sec)

GROUP BY causes all rows with the same id (or whatever column you specify) to be grouped together into a single row, effectively removing duplicates. HAVING COUNT(*) = 2 limits the results to those groups that contain exactly two matching rows. Because the WHERE clause only lets through rows whose tag is in the IN list, this ensures that only items carrying both tags are returned. Note that this value must match the number of tags specified in the IN clause. Here’s an example that returns some rows:

SELECT item.id, item.item, tag.tag
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'pet')
GROUP BY item.id HAVING COUNT(*) = 2;

+----+----------+-----+
| id | item     | tag |
+----+----------+-----+
|  1 | spaniel  | pet |
|  4 | goldfish | pet |
+----+----------+-----+

Note that in the previous example, the items have been grouped together so that you don’t get duplicates. In this case, there’s no need for the tag column, as that just confuses the results—you already know what tags there are, as you have requested items with those tags. You can therefore simplify things a little by removing the tag column from the query:

SELECT item.id, item.item
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'pet')
GROUP BY item.id HAVING COUNT(*) = 2;

+----+----------+
| id | item     |
+----+----------+
|  1 | spaniel  |
|  4 | goldfish |
+----+----------+
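
The three-tag case from the question ("cheap", "pet" and "dog") follows the same pattern. The sketch below assumes the test data above. It raises the HAVING value to 3 to match the three tags in the IN list. It also uses COUNT(DISTINCT tag.id) instead of COUNT(*). That is a defensive choice of mine rather than part of the original answer: it keeps the count correct even if the join table ever allowed the same item/tag pair twice.

```sql
-- Items that carry ALL three tags.
-- The number in HAVING must equal the number of tags in the IN list.
SELECT item.id, item.item
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag.tag IN ('cheap', 'pet', 'dog')
GROUP BY item.id
HAVING COUNT(DISTINCT tag.id) = 3;
```

With the test data this should return only spaniel: goldfish matches two of the three tags and tabby only one. Selecting only item.id and item.item also keeps the query valid when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). That mode rejects the earlier variants that select tag.tag without aggregating it.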

You could go a step further, and use GROUP_CONCAT to provide a list of matching tags. This might be handy where you want a list of items that have one or more of the specified tags, but not necessarily all of them:

SELECT item.id, item.item, GROUP_CONCAT(tag.tag) AS tags
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag IN ('cheap', 'pet', 'bird', 'cat')
GROUP BY item.id;

+----+----------+-----------+
| id | item     | tags      |
+----+----------+-----------+
|  1 | spaniel  | pet,cheap |
|  2 | tabby    | pet,cat   |
|  3 | chicken  | bird      |
|  4 | goldfish | pet,cheap |
+----+----------+-----------+
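
If the goal is "best match first" rather than "all tags or nothing", the same query can also report how many of the requested tags each item has and sort by that number. This is a sketch under the same test data. The column alias matches is an illustrative name, and the ORDER BY and SEPARATOR options inside GROUP_CONCAT only make the tag list deterministic and easier to read.

```sql
-- Rank items by how many of the requested tags they carry.
SELECT item.id, item.item,
       GROUP_CONCAT(tag.tag ORDER BY tag.tag SEPARATOR ', ') AS tags,
       COUNT(*) AS matches
FROM item
JOIN item_tag ON item_tag.item_id = item.id
JOIN tag ON item_tag.tag_id = tag.id
WHERE tag.tag IN ('cheap', 'pet', 'bird', 'cat')
GROUP BY item.id
ORDER BY matches DESC, item.id;
```

Items with two matching tags (spaniel, tabby, goldfish) should sort ahead of chicken, which matches only bird.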

One problem with the above schema design is that it is possible to enter duplicate items and tags. That is, you could insert bird into the tag table as many times as you like, and this is not good. One way to fix that is to add a UNIQUE INDEX to the item and tag columns. This has the added benefit of helping to speed up queries which rely on these columns. The updated CREATE TABLE commands now look like this:

CREATE  TABLE IF NOT EXISTS `item` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `item` VARCHAR(255) NOT NULL ,
  UNIQUE INDEX `item` (`item`) ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS `tag` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `tag` VARCHAR(255) NOT NULL ,
  UNIQUE INDEX `tag` (`tag`) ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;
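
Note that CREATE TABLE IF NOT EXISTS leaves a table that already exists untouched. If the tables were created with the earlier definitions, the unique indexes can be added in place instead. A sketch, assuming the tables do not yet contain duplicate values (the ALTER would otherwise fail with a duplicate-entry error):

```sql
-- Add the unique indexes to tables that already exist.
ALTER TABLE item ADD UNIQUE INDEX `item` (`item`);
ALTER TABLE tag  ADD UNIQUE INDEX `tag` (`tag`);
```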

Now if you try to insert a duplicate value, MySQL will prevent you from doing so:

INSERT INTO tag (tag) VALUES ('bird');
ERROR 1062 (23000): Duplicate entry 'bird' for key 'tag'
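
With the unique indexes in place, application code that tags items has to cope with tags that may already exist. One possible approach, offered here as a sketch rather than as part of the original answer, is INSERT IGNORE, which skips rows that would violate a unique key instead of raising an error. Be aware that it also turns some other errors into warnings.

```sql
-- Create the tag only if it is not there yet.
INSERT IGNORE INTO tag (tag) VALUES ('bird');

-- Link an item to the tag by name. The composite primary key on
-- item_tag makes this a no-op if the link already exists.
INSERT IGNORE INTO item_tag (item_id, tag_id)
SELECT item.id, tag.id
FROM item
JOIN tag ON tag.tag = 'bird'
WHERE item.item = 'chicken';
```

Run against the test data, both statements should leave the tables unchanged, because bird already exists and chicken is already tagged with it.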
