MySQL Fulltext Stopwords Rationale


Question

I am currently trying to develop a basic fulltext search for my website, and I noticed that certain words like "regarding" are listed as stopwords for MySQL fulltext searches. This doesn't bother me too much right now since people searching for a given news item wouldn't necessarily search using the word "regarding" (but I certainly can't speak for everyone!). However, I was hoping someone here could enlighten me about the rationale for having a stopwords list. Thanks!

For clarification: I'm using MyISAM for my fulltext table. The stopwords are words that MySQL won't index (for any fulltext index). As noted in a comment on this question, there is a full list of stopwords, but it comes without any kind of explanation. I'd just like to know whether there was a rationale behind the words "they" chose.

Solution

The stop words are just common words in the English language. In most cases, your search results will be more relevant -- and your indices will be smaller and faster -- if you don't index these words.
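To make the size argument concrete, here is a minimal Python sketch of a toy inverted index built with and without filtering. It is not how MySQL implements fulltext indexing; the `STOPWORDS` set is a small illustrative subset rather than MySQL's built-in list, and `build_index`, `tokenize`, and the sample documents are hypothetical names and data invented for this example.

```python
import re

# Illustrative subset only; NOT MySQL's built-in stopword list.
STOPWORDS = {"the", "a", "of", "and", "regarding"}
MIN_WORD_LEN = 4  # plays the role of ft_min_word_len in this toy model


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9_']+", text.lower())


def build_index(docs, stopwords=STOPWORDS, min_len=MIN_WORD_LEN):
    """Map each indexed word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in tokenize(text):
            # Skip words that are too short or listed as stop words.
            if len(word) < min_len or word in stopwords:
                continue
            index.setdefault(word, set()).add(doc_id)
    return index


# Hypothetical sample documents.
docs = {
    1: "News regarding the budget",
    2: "The budget and the election",
}

filtered = build_index(docs)
unfiltered = build_index(docs, stopwords=set(), min_len=1)

print(sorted(filtered))                # words that survive filtering
print(len(filtered), len(unfiltered))  # index size with and without filtering
```

The filtered index keeps only the words that distinguish one document from another, while the unfiltered one also stores entries such as "the" that match nearly every document and so add size without helping relevance.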

You can replace the stop word list using the ft_stopword_file variable (or set it to '' to disable stop word filtering, so that every word at least ft_min_word_len characters long is indexed) if that suits your needs better. You can also change the minimum indexed word length using the ft_min_word_len variable, which exists for the same reason.
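If you go the custom-list route, the file itself is plain text. The Python sketch below writes one word per line, which as far as I know is a layout MySQL accepts; the helper `write_stopword_file`, the word list, and the file name are all hypothetical. Pointing the server at the file is a separate step (setting ft_stopword_file in the server's option file and restarting), and existing FULLTEXT indexes generally need to be rebuilt afterwards before the new list takes effect.

```python
import tempfile
from pathlib import Path


def write_stopword_file(words, path):
    """Write unique, lowercased stop words, one per line; return the count."""
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)


# Hypothetical word list and file name, chosen only for illustration.
target = Path(tempfile.gettempdir()) / "custom_stopwords.txt"
count = write_stopword_file(["the", "And", "of", "the "], target)
print(count, target)
```

Leaving "regarding" out of a list like this would make it searchable again, provided it also meets the ft_min_word_len threshold.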
