Wildcard search in MySQL full-text search


Question


How can I write a MySQL full-text query that matches on the end of a word, so that:

'nited' finds 'united', and 'oogle' finds 'google'

In other words, the equivalent of what the LIKE operator does with '%nited' and '%oogle'.

Solution

Unfortunately, you cannot do this with a MySQL full-text index. A term such as '*nited states' cannot be looked up directly in the index, because the index is ordered by the leading characters of each word. You can, however, search for 'United Sta*'.

-- the only wildcard form MySQL full-text search supports: a trailing *
WHERE MATCH(column) AGAINST ('United Sta*' IN BOOLEAN MODE)
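
To make the fragment above concrete, here is a minimal sketch of a complete query. The table name `countries`, the column `name`, and the index name `ft_name` are hypothetical, chosen only for illustration. It also shows a detail that is easy to miss: in boolean mode, terms without a leading `+` are optional, so 'United Sta*' on its own behaves more like "United OR Sta*" than like a phrase prefix.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE countries (
  id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  FULLTEXT KEY ft_name (name)
) ENGINE=MyISAM;

INSERT INTO countries (name) VALUES
  ('United States'), ('United Kingdom'), ('Statue of Liberty');

-- Trailing wildcard with both terms required ('+'):
-- should match only rows containing "United" and a word starting with "Sta".
SELECT id, name
FROM countries
WHERE MATCH(name) AGAINST ('+United +Sta*' IN BOOLEAN MODE);

-- Without '+', each term is optional, so a row matching either term qualifies.
SELECT id, name
FROM countries
WHERE MATCH(name) AGAINST ('United Sta*' IN BOOLEAN MODE);
```

The `*` operator must be appended to the word; there is no boolean-mode operator for a leading wildcard, which is why '*nited' has no full-text equivalent.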

MySQL's full-text search performs best when matching whole words in sentences, and even that can be slow at times. Beyond that, I'd suggest an external full-text engine such as Solr or Sphinx. I think Sphinx supports both prefix and suffix wildcards; I'm not sure about the others.

You could fall back to MySQL's LIKE clause, but queries such as LIKE '%nited states' or LIKE '%nited Stat%' will also perform poorly, because an index cannot be used when the leading characters are unknown. 'United Sta%' and 'Unit%States' are fine, since the index can be used for the known leading characters.
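
One way to check this on your own data is to compare the plans with EXPLAIN. The sketch below assumes a hypothetical table `places` with an ordinary B-tree index `idx_name` on `name`; the exact plan depends on the MySQL version, the storage engine, and how much data the table holds.

```sql
-- Hypothetical table; idx_name is an ordinary (non-full-text) index.
CREATE TABLE places (
  id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  KEY idx_name (name)
);

-- Known leading characters: the optimizer can do a range scan on idx_name.
EXPLAIN SELECT id FROM places WHERE name LIKE 'United Sta%';
EXPLAIN SELECT id FROM places WHERE name LIKE 'Unit%States';

-- Leading wildcard: there is no known prefix to seek on,
-- so every row (or every index entry) has to be examined.
EXPLAIN SELECT id FROM places WHERE name LIKE '%nited states';
EXPLAIN SELECT id FROM places WHERE name LIKE '%nited Stat%';
```

On a populated table, the `type` column of the EXPLAIN output would typically read `range` for the first two queries and `ALL` or `index` for the last two.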

Another major caveat of MySQL's full-text indexing is the stop-word list and the minimum word length setting. On a shared hosting environment, for example, you will usually be stuck with the default minimum word length of 4 characters, so searching for 'Goo' to find 'Google' would fail. The stop-word list also excludes common words such as 'and', 'maybe' and 'outside'; in fact, there are 548 stop words altogether! If you are not on shared hosting, these settings are relatively easy to change, but if you are, some of the defaults will get annoying.
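
If you have access to the server, the settings in effect can be inspected as sketched below. `ft_min_word_len` applies to MyISAM full-text indexes; the InnoDB variable and stop-word table exist only on MySQL 5.6 and later, which is newer than the version this answer was written against. `my_table` is a placeholder name.

```sql
-- Minimum indexed word length for MyISAM full-text indexes (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- InnoDB equivalent, MySQL 5.6+ (default 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Default InnoDB stop-word list, MySQL 5.6+.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- ft_min_word_len can only be set at server startup (for example in my.cnf).
-- After changing it and restarting, existing MyISAM full-text indexes
-- must be rebuilt before the new value takes effect:
REPAIR TABLE my_table QUICK;
```

Note that the default InnoDB stop-word list is much shorter than the MyISAM list described above, so the behaviour for common words differs between the two engines.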
