标签系统:Toxi解决问题 [英] Tagging system: Toxi solution questions

查看:153
本文介绍了标签系统:Toxi解决问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在打破Toxi解决方案的标签数据库模式。我正在使用用户可以提交项目的系统,这些项目可以具有与其相关联的标签。阅读标签后,我找到了最适合我需要的Toxi解决方案。但是,我不完全确定我是否正在规划这个权利,所以我想要你的意见。



我会有三个数据库。
$ 项目包含 item_id 和其他人

tagmap 使用 item_id tag_id 作为外键

标签包含 tag_id tag_text



添加新项目时,我是否愿意将标记添加到数据库中的过程如下?


  1. 为数组中的每个标签将提交的标签排序到数组



    1. 从标签中获取tag_id tag_text匹配当前标签

    2. 如果返回0行:


      1. 将标签添加到标签表

      2. 获取tag_id


    3. 将item_id和tag_id添加到tagmap


  2. 完成(给用户一个好的等等)

这意味着我们最后会有一个条目在每个项目的每个标签的标签图中。这似乎是正确的,但我不禁想到有一个更好的方式来做,而不是结束大量的条目...



至于编辑标签,我已经想到了以下过程,虽然我认为有一个更好的方法,我还没有发现。


  1. 获取使用item_id并插入可编辑字段的标签

  2. 用户进行更改。提交:

  3. 从tagmap中删除行,其中item_id与正在编辑的项匹配。

  4. 与上述相同的过程

我真的很喜欢关于第3点的地方。有没有办法检查是否有任何标签已被删除,所以我可以选择性地删除标签,而不是删除和重新添加它们?
只是为了确保:删除tagmap行时,相关项目将不会被删除,因为它指向一个外键而不是一个外键,对吗?



另外,我可能想要跟踪标签的使用次数,但是我不想在每次需要显示时运行查询来计数。我正在考虑使用cron作业计算每个小时或每小时一次的tagmap中每个tag_id的实例数,然后更新tags表中的tag_use值。这是一个正确的方法,还是有更好的方法?



回头看,这是非常大量的文本。 Welp比缺少信息的细节要细,而是提出太多的问题,并且学习很多新东西,而不是询问太少。
很好的机会我今天花了太多时间研究一下,明天会变得更有意义。



提前感谢! p>

解决方案

首先,toxi不是一个标准术语。始终定义您的条款!或者至少提供相关的链接。



现在到问题本身...


我有三个数据库。


不,你会有3个表。


添加新项目时


你几乎在正确的轨道上,除了你可以使用SQL的基于集合的性质来合并许多这些步骤。例如,使用标签标记项目1:'tag1','tag2'和'tag3'可以这样做...

  INSERT IGNORE INTO tagmap(item_id,tag_id)
SELECT 1,tag_id FROM tags WHERE tag_text IN('tag1','tag2','tag3');

IGNORE 允许此功能成功,即使项目已经连接到这些标签中的一些。



这假定所有必需的标签都已经在标签中。假设 tag.tag_id 是自动递增的,你可以这样做,以确保它们是:

  INSERT IGNORE INTO标签(tag_text)VALUES('tag1'),('tag2'),('tag3'); 




这意味着我们将在tagmap中得到一个条目每个项目的每个标签。这似乎是正确的,但我不禁想到有一个更好的方法,然后结束了大量的条目...


没有魔法。如果项目连接到特定标签是您想要记录的知识,那么它将在数据库中具有某种物理表示。


关于编辑标签...


您的意思是重新标记项目(不修改标签本身)?



要删除列表中不存在的所有标签,请执行以下操作:

  DELETE FROM tagmap 
WHERE
item_id = 1
AND tag_id NOT IN(
SELECT tag_id FROM tags
WHERE tag_text IN('tag1','tag3')
);

这将断开该项目与除tag1和tag3之外的所有标签。执行上面的INSERT和这个DELETE一个接一个地覆盖添加和删除标签。



你可以玩所有这一切在。



在乍一看,toxi可能看起来像是节省空间,但实际上实际上并不是这样,因为它需要额外的表和索引(而且标签往往很短)。


此外,我可能想跟踪标签... cron工作的次数...


在决定做这样的事情之前先测量。上面提到的我的SQL Fiddle在 tagmap PK中使用了一个非常有意义的字段顺序,所以数据集中在一种对这种计数非常友好的方式(记住: InnoDB表被聚集)。



无论如何, measure

强>实际数据量!


I'm sort of breaking my head over the Toxi solution for tag database schemas. I'm working on a system to which users can submit items, and those items can have tags associated with them. After reading up on tagschemas, I found the Toxi solution to suit my needs most. However, I'm not entirely sure if I'm planning this right, so I'd like your opinions on this please.

I'll have three databases.
items containing item_id and others
tagmap using item_id and tag_id as foreign keys
tags containing tag_id and tag_text

When adding a new item, am I right to assume the process to add the tags to the database is as follows?

  1. sort submitted tags into array
  2. for every tag in the array:

    1. get tag_id from tags where tag_text matches the current tag
    2. if that returns 0 rows:

      1. add tag to tags table
      2. get the tag_id

    3. add item_id and tag_id to tagmap

  3. finish (give user the a-okay, etc)

This means we'll end up with an entry in the tagmap for every tag for every item. It seems correct, but I can't help but think there's a better way to do that than ending up with a huge amount of entries there...

As for editing the tags, I've thought up the following process, though I think there's a better way which I haven't found yet.

  1. get tags using item_id and insert into editable field
  2. user makes changes. on submit:
  3. delete rows from tagmap where item_id matches the one being edited
  4. same process as the one listed above

I'm kind of iffy about point 3 there. Is there a way for me to check if any tags have been removed, so I can selectively delete tags instead of just deleting and re-adding them? And just to be sure: when deleting tagmap rows, the related items won't be deleted with it because it points to a foreign key instead of acting as one, right?

Also, I may want to keep track of the amount of times a tag is used, but I don't want to run a query to count those every time it needs to be displayed. I'm thinking of having a cron job count the number of instances for every tag_id in tagmap once every hour, or bihourly, and then update the tag_use value in the tags table. Is that a correct way to do it, or is there a better way?

Looking back, that's quite the hefty amount of text. Welp, rather too detailed than missing information, and rather asking too many questions and learning a lot of new things than asking too few. Good chance I've just spent too much time looking into this today, and it'll all make more sense tomorrow.

Thanks in advance!

解决方案

First of all "toxi" is not a standard term. Always define your terms! Or at least provide relevant links.

And now to the question itself...

I'll have three databases.

No, you'll have 3 tables.

When adding a new item...

You are pretty much on the right track, with the exception that you can use the set-based nature of SQL to "merge" many of these steps. For example, tagging an item 1 with tags: 'tag1', 'tag2' and 'tag3' can be done like this...

INSERT IGNORE INTO tagmap (item_id, tag_id)
SELECT 1, tag_id FROM tags WHERE tag_text IN ('tag1', 'tag2', 'tag3');

The IGNORE allows this to succeed even if item is already connected to some of these tags.

This assumes all required tags are already in tags. Assuming tag.tag_id is auto-increment, you can do something like this to ensure they are:

INSERT IGNORE INTO tags (tag_text) VALUES ('tag1'), ('tag2'), ('tag3');

This means we'll end up with an entry in the tagmap for every tag for every item. It seems correct, but I can't help but think there's a better way to do that then ending up with a huge amount of entries there...

There is no magic. If "item is connected to a particular tag" is piece of knowledge you want to record, then it will have to have some sort of physical representation in the database.

As for editing the tags...

You mean re-tagging items (not modifying tags themselves)?

To remove all tags that are not in the list, do something like this:

DELETE FROM tagmap
WHERE
    item_id = 1
    AND tag_id NOT IN (
        SELECT tag_id FROM tags
        WHERE tag_text IN ('tag1', 'tag3')
    );

This will disconnect the item from all tags except 'tag1' and 'tag3'. Execute the INSERT above and this DELETE one after another to "cover" both adding and removing tags.

You can play with all this in the SQL Fiddle.

And just to be sure: when deleting tagmap rows, the related items won't be deleted with it because it points to a foreign key instead of acting as one, right?

Correct. A child endpoint of a FK will not trigger a referential action (such as ON DELETE CASCADE), only parent will.

BTW, you are using this schema because you want additional fields in tags (beside tag_text), right? If you do, not loosing this additional data just because all connections are gone is desired behavior.

But if you just wanted the tag_text, you'd use a simpler schema where deleting all connections would be the same as deleting the tag itself:

This would not just simplify the SQL, it would also provide better clustering.

At first glance, "toxi" might look like it's saving space, but this might actually not be the case in practice, since it requires additional tables and indexes (and tags tend to be short).

Also, I may want to keep track of the amount of times a tag ... cron job ...

Measure before you decide to do something like this. My SQL Fiddle mentioned above uses a very deliberate order of fields in the tagmap PK, so data is clustered in a way very friendly to this kind of counting (remember: InnoDB tables are clustered). You'd have to have a tryly huge amount of items (or require unusually high performance) before this becomes a problem.

In any case, measure on realistic amounts of data!

这篇关于标签系统:Toxi解决问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆