MySQL LIKE and BOOLEAN MODE (FULLTEXT) search


Problem description

I'm trying to write a query to search for a record using a wildcard.

I have two queries below, both of which work, but I'd like to know which one is more optimized.

Query one does give me what I'm looking for, but query two gives me different results.

Which one should I be using?

  1. Using LIKE in my query.

    SELECT code, name
    FROM countryCounty
    WHERE name LIKE  '%County Down%'
    AND isActive =1
    AND countryISO2FK =  'GB'
    LIMIT 1
    

  2. Then I have a boolean mode (FULLTEXT) query.

    SELECT code,name, match( name )
    AGAINST ( 'County Down' IN BOOLEAN MODE ) AS relevance
    FROM opjb_countryCounty 
    WHERE match( name ) AGAINST ( '%County Down%' IN BOOLEAN MODE ) 
        AND isActive=1 
        AND countryISO2FK='GB' 
    ORDER BY relevance DESC LIMIT 1
    

Solution

There's a significant difference between the two queries.

The first query is searching for an occurrence of the single string 'County Down' within the name column.

The second query is searching for occurrences of either of the two separate words (separate strings) 'County' and 'Down' within the text. (The purpose and effect of that '%' character before 'County' in that second query is unknown to me.)

The relevance from a BOOLEAN MODE fulltext search is going to be 1.0. If you want to return only those rows that contain both of the words 'County' and 'Down', then you really want the '+' operator before each word, for example:

    MATCH(name) AGAINST('+County +Down' IN BOOLEAN MODE)

Note that this predicate will also match a name such as 'Some Down and out County', which the first query would not.
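As a rough sketch of how the two boolean-mode variants could look as full queries (reusing the table and column names from the question, and assuming a FULLTEXT index exists on name): the first requires both words in any order, while the second uses a double-quoted phrase, which is closer to what the LIKE query is asking for.

```sql
-- Both words required, in any order and at any position in name.
SELECT code, name
FROM countryCounty
WHERE MATCH(name) AGAINST('+County +Down' IN BOOLEAN MODE)
  AND isActive = 1
  AND countryISO2FK = 'GB';

-- Double quotes ask for the words as a phrase, in that order.
SELECT code, name
FROM countryCounty
WHERE MATCH(name) AGAINST('"County Down"' IN BOOLEAN MODE)
  AND isActive = 1
  AND countryISO2FK = 'GB';
```

Even the phrase form is not identical to LIKE '%County Down%': fulltext search works on whole words, and words that are shorter than the configured minimum length or that appear on the stopword list are not indexed, so the results can still differ.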


Also, the approach used to get the result set ordered by relevance is almost right. There's a subtle problem: including the IN BOOLEAN MODE modifier causes the expression to return 1.0 instead of the weighted float that NATURAL LANGUAGE MODE would return. (This can depend on the storage engine: InnoDB, which supports FULLTEXT indexes from MySQL 5.6 on, does compute a ranking in boolean mode.)
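One way to work around this, sketched here rather than tested against your data, is to keep the boolean-mode predicate for filtering and compute the sort key with a separate natural language mode expression. The table and column names are taken from the question.

```sql
SELECT code, name,
       -- Natural language mode yields a weighted relevance score.
       MATCH(name) AGAINST('County Down' IN NATURAL LANGUAGE MODE) AS relevance
FROM countryCounty
-- Boolean mode decides which rows qualify.
WHERE MATCH(name) AGAINST('+County +Down' IN BOOLEAN MODE)
  AND isActive = 1
  AND countryISO2FK = 'GB'
ORDER BY relevance DESC
LIMIT 1;
```

One caveat: on MyISAM, natural language mode gives no weight to words that appear in half or more of the rows, so on a small table where most names contain 'County' the relevance may be lower than expected, or even 0.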


To answer your question: if the first query is returning the result set you need, then use that query. The downside is that its LIKE predicate is not sargable: because the pattern starts with a wildcard, MySQL can't make use of an index range scan to satisfy it. (An index may be used for the other predicates, but the name column on each of those rows still needs to be checked.)
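To illustrate the difference, here is a small sketch that assumes an ordinary B-tree index on name (the index name idx_name is made up for the example and is not part of the original schema). EXPLAIN can be used to compare a leading-wildcard pattern with a prefix pattern:

```sql
-- Hypothetical index, added only for this comparison.
CREATE INDEX idx_name ON countryCounty (name);

-- Leading wildcard: the pattern gives the index no starting point,
-- so name has to be checked on every candidate row.
EXPLAIN SELECT code, name FROM countryCounty WHERE name LIKE '%County Down%';

-- Constant prefix: eligible for a range scan on idx_name.
EXPLAIN SELECT code, name FROM countryCounty WHERE name LIKE 'County Down%';
```

Whether the optimizer actually picks the index for the second query depends on table size and statistics, so the EXPLAIN output is worth checking on your own data.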

The advantage of a query of the second form is that it can make use of a FULLTEXT INDEX, if one is created, which can improve performance.
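If no such index exists yet, a minimal sketch of adding one and checking that it is used might look like this. The index name ft_name is hypothetical, and InnoDB tables need MySQL 5.6 or later for FULLTEXT indexes.

```sql
-- Hypothetical index name; adjust to your naming scheme.
ALTER TABLE countryCounty ADD FULLTEXT INDEX ft_name (name);

-- When the index is used, the EXPLAIN "type" column shows "fulltext".
EXPLAIN
SELECT code, name
FROM countryCounty
WHERE MATCH(name) AGAINST('+County +Down' IN BOOLEAN MODE);
```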
