Best way to store and retrieve synonyms in a MySQL database


Question

I am building a synonym list that I will store in a database and look up before running a full-text search.

When a user enters a word such as: word1

I need to look this word up in my synonyms table. If the word is found, I would SELECT all of its synonyms and use them in the full-text search in the next query, which I construct like

MATCH (columnname) AGAINST ('word1a word1b word1c' IN BOOLEAN MODE)

So how should I store the synonyms in a table? I see two options:

  1. Using key and word columns, like

val  keyword
-------------
1    word1a
1    word1b
1    word1c
2    word2a
2    word2b
3    word3a
etc.

That way I can find an exact match for the entered word in one query and get its ID. In a second SELECT I fetch all the words with that ID and concatenate them with a recordset loop in the server-side language. I can then build the real search against the main table where I need to look for the words.

  2. Using only a word column, like

word1a|word1b|word1c
word2a|word2b|word2c
word3a

Now I SELECT for my word to see whether it is inside any record; if it is, I fetch the whole record, explode it at |, and I have my words again, ready to use.

This second approach looks easier to maintain for whoever builds the synonym database, but I see two problems:

a) How do I find out in MySQL whether a word is inside the string? I cannot just use LIKE 'word1a', because synonyms can look very much alike: word1a could be strawberry, strawberries could be birds, and word2a could be berry. Obviously I need an exact match, so how can a LIKE statement match a word exactly inside a string?

b) I see a speed problem: I guess LIKE would take MySQL more time than the "=" of the first approach, where I match a word exactly. On the other hand, the first option needs two statements: one to get the ID of the word and a second to get all the words with that ID.

How would you solve this problem, which is really a dilemma about which approach to take? Is there a third way I am not seeing that makes it easy for an admin to add and edit synonyms and is fast and optimal at the same time? OK, I know there is usually no single best way ;-)

UPDATE: The solution with two tables, one for the master word and a second for the synonyms, will not work in my case, because there is no MASTER word that the user types into the search field. He can type any of the synonyms, so I am still wondering how to set up these tables: I have no master words that would have IDs in one table, with the synonyms carrying the master's ID in a second table. There is no master word.

Recommended answer

Don't use a single string to store several different entries.

In other words: build a word table (word_ID, word) and a synonym table (word_ID, synonym_ID), then add the word to the word table and one entry per synonym to the synonym table.
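The two-table layout can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL, the table names `word` and `synonym` and the helper `add_synonym_group` are made up for the example, and SQLite's `INSERT OR IGNORE` would be `INSERT IGNORE` in MySQL.

```python
import sqlite3
from itertools import permutations

# Assumption: SQLite stands in for MySQL; table and column names are illustrative.
SCHEMA = """
CREATE TABLE word (
    word_id INTEGER PRIMARY KEY,
    word    TEXT NOT NULL UNIQUE
);
CREATE TABLE synonym (
    word_id    INTEGER NOT NULL REFERENCES word(word_id),
    synonym_id INTEGER NOT NULL REFERENCES word(word_id),
    PRIMARY KEY (word_id, synonym_id)
);
"""


def add_synonym_group(conn, words):
    """Hypothetical helper: store each word once, then link every ordered pair."""
    ids = []
    for w in words:
        # The UNIQUE constraint makes this a no-op when the word already exists.
        conn.execute("INSERT OR IGNORE INTO word (word) VALUES (?)", (w,))
        row = conn.execute(
            "SELECT word_id FROM word WHERE word = ?", (w,)
        ).fetchone()
        ids.append(row[0])
    ids = list(dict.fromkeys(ids))  # drop repeats so no word is paired with itself
    conn.executemany(
        "INSERT OR IGNORE INTO synonym (word_id, synonym_id) VALUES (?, ?)",
        permutations(ids, 2),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_synonym_group(conn, ["A", "B", "C"])
word_count = conn.execute("SELECT COUNT(*) FROM word").fetchone()[0]
pair_count = conn.execute("SELECT COUNT(*) FROM synonym").fetchone()[0]
print(word_count, pair_count)  # 3 6
```

Because every ordered pair is stored, no word plays the role of a master: any member of the group can be the starting point of a lookup.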

UPDATE (added a third synonym)

Your word table must contain every word (ALL of them); your synonym table holds only pointers to synonyms (not a single word!).

If you had three words, A, B and C, that are synonyms, your DB would be

WORD_TABLE            SYNONYM_TABLE
ID | WORD             W_ID | S_ID
---+-----             -----+-------
1  | A                  1  |  2
2  | B                  2  |  1
3  | C                  1  |  3
                        3  |  1
                        2  |  3
                        3  |  2  

Don't be afraid of the many entries in SYNONYM_TABLE; they will be managed by the computer and are needed to reflect the existing relations between the words.
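With the A/B/C rows from the table above, the synonyms of whatever word the user typed can be fetched in a single statement by joining the word table to itself through the synonym table. The sketch below again uses sqlite3 as a stand-in for MySQL; the function name `search_terms` is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same rows as the WORD_TABLE / SYNONYM_TABLE example above.
conn.executescript("""
CREATE TABLE word_table (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE);
CREATE TABLE synonym_table (
    w_id INTEGER NOT NULL,
    s_id INTEGER NOT NULL,
    PRIMARY KEY (w_id, s_id)
);
INSERT INTO word_table VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO synonym_table VALUES (1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2);
""")


def search_terms(conn, typed):
    """Hypothetical helper: the typed word followed by its synonyms, via one join."""
    rows = conn.execute(
        """
        SELECT s.word
        FROM word_table AS w
        JOIN synonym_table AS st ON st.w_id = w.id
        JOIN word_table AS s ON s.id = st.s_id
        WHERE w.word = ?
        ORDER BY s.word
        """,
        (typed,),
    ).fetchall()
    # An unknown word simply yields itself, so the search still runs.
    return [typed] + [r[0] for r in rows]


terms = search_terms(conn, "B")
against = " ".join(terms)
print(against)  # B A C
```

The resulting string is what would be handed to MATCH ... AGAINST (... IN BOOLEAN MODE) as the search text. The lookup itself is an exact "=" match on a uniquely indexed column, so no LIKE is needed.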

Second approach

You might also be tempted (I don't think you should!) to go with one table that has separate fields for the word and a list of synonyms (or IDs): (word_id, word, synonym_list). Beware that this is contrary to the way a relational DB works (one field, one fact).
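A small illustration of why a list packed into one field is awkward to query, using the strawberry/berry case from the question. The rows, the table name `synonym_list` and both helper functions are invented for this sketch, and sqlite3 again stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample rows: each row packs one synonym group into a single string.
conn.executescript("""
CREATE TABLE synonym_list (words TEXT NOT NULL);
INSERT INTO synonym_list VALUES ('strawberry|fragaria'), ('berry|soft fruit');
""")


def substring_lookup(conn, typed):
    """Substring match with LIKE: also hits rows where typed is part of a longer word."""
    rows = conn.execute(
        "SELECT words FROM synonym_list WHERE words LIKE ? ORDER BY rowid",
        (f"%{typed}%",),
    ).fetchall()
    return [r[0] for r in rows]


def exact_lookup(conn, typed):
    """Split every candidate row in application code and keep exact matches only."""
    return [ws for ws in substring_lookup(conn, typed) if typed in ws.split("|")]


loose = substring_lookup(conn, "berry")
exact = exact_lookup(conn, "berry")
print(loose)  # ['strawberry|fragaria', 'berry|soft fruit']
print(exact)  # ['berry|soft fruit']
```

The database alone returns the strawberry row for "berry", so the exact match has to be redone outside SQL, and a pattern that starts with a wildcard generally cannot use an ordinary index.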
