mysql - extract specific words from text field using full text search


Question



My question is a little similar to "Extract specific words from text field in mysql", but not the same.

I have a text field with words inside. In my language a word can have many different endings. I need to find these endings.

I use MySQL's full-text search, but I would need access to the index itself, where the whole field is "cut" into words and the words are counted. I could then search for "test*" and quickly find "test", "tested", "testing". I need a list of all the endings that exist in my database; that is my primary goal.

As it is, I can get the records that contain specific "test*" words, but I need not only to locate the occurrence in the field but also to group the matches somehow, so that I get a list of all the words that, for example, start with "test". I don't need to know which record they are in, just a list, grouped so that "testing" is written only once rather than 10 times (a counter of how many times it was found would be nice but is not necessary).

Is there a way to extract this information from the full-text index, or should I explode all these fields into words, build an index table full of words, and just do a LIKE 'word%' and group by the different results? I am not sure how to do that in practice either, so please just point me in the right direction.

So to summarize: I have a text field and I need to find out which words inside it start with "test", like "tested", "test", "testing", etc. It doesn't make sense in English, but in my language it does, as we have the same word with different endings, and there are so many of them, sometimes 20. I need to find out which ones are there so I can build a synonyms table ;-)

UPDATE:

Database has columns ID (int), ingredients (text) and recipe (text).

Data in ingredients are cooking ingredients with different endings like:

1 egg
2 eggs

etc.

Solution

You can dump all the words that are present in a full-text index. The dump also shows the frequency of each word, e.g. "test" is used 200 times and "testing" is used 300 times.

Manual for that (the myisam_ftdump utility, which works on MyISAM tables): http://dev.mysql.com/doc/refman/5.0/en/myisam-ftdump.html
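
If dumping the index is not an option, the alternative raised in the question (splitting every field into words and grouping them by prefix) can be sketched on the application side. This is only a rough illustration, not the answer's method: the function name `words_with_prefix` and the sample rows are hypothetical, and in practice the rows would come from the `ingredients` column.

```python
import re
from collections import Counter


def words_with_prefix(rows, prefix):
    """Count each distinct word that starts with `prefix` across all rows."""
    counts = Counter()
    for text in rows:
        # \w+ matches runs of Unicode word characters, so words in
        # languages other than English are tokenized as well.
        for word in re.findall(r"\w+", text.lower()):
            if word.startswith(prefix):
                counts[word] += 1
    return counts


# Hypothetical sample data standing in for the `ingredients` column.
rows = ["1 egg", "2 eggs", "Testing the tested test", "egg whites"]

egg_counts = words_with_prefix(rows, "egg")
for word, count in egg_counts.most_common():
    print(word, count)
```

Each distinct ending appears once with its count, which is the grouped list the question asks for. On a large table it would likely be better to store the words in their own indexed table once and query that, rather than re-tokenizing every row per lookup.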
