PG full text search on Rails using the pg_search gem for substrings


Question


I am using PostgreSQL full text search for my search. Since I am using Ruby on Rails, I am using the pg_search gem. How do I configure it to give a hit for a substring as well?

pg_search_scope :search_by_detail, 
              :against => [
                   [:first_name,'A'],
                   [:last_name,'B'],
                   [:email,'C']
              ],                  
              :using => {
                :tsearch => {:prefix => true}
              }

Right now it gives a hit if the substring is at the start, but it won't give a hit if the substring is in the middle.

For example, it gives a hit for sdate@example.com but not for example.com.

Solution

I'm the author and maintainer of pg_search.

Unfortunately, PostgreSQL's tsearch by default doesn't split up email addresses and allow you to match against parts. It might work if you turned on :trigram search, though, since it matches arbitrary sub-strings that appear anywhere in the searchable text.

pg_search_scope :search_by_detail,
                :against => [
                  [:first_name,'A'],
                  [:last_name,'B'],
                  [:email,'C']
                ],
                :using => {
                  :tsearch => {:prefix => true},
                  :trigram => {}
                }
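
To see why trigram matching can find example.com inside a longer address, here is a rough pure-Ruby model of how the pg_trgm extension (which pg_search's :trigram feature relies on) compares two strings. It is a simplified sketch, not the extension's actual code: it assumes words are split on non-alphanumeric characters, each word is padded with two leading spaces and one trailing space, and similarity is the number of shared trigrams divided by the number of distinct trigrams in both strings. The method names `trigrams` and `trigram_similarity` are made up for this illustration.

```ruby
require "set"

# Approximate pg_trgm's trigram extraction: lowercase the text, split it
# into alphanumeric words, pad each word ("  word "), and collect every
# 3-character window into a set.
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).each_with_object(Set.new) do |word, set|
    padded = "  #{word} "
    (0..padded.length - 3).each { |i| set << padded[i, 3] }
  end
end

# Similarity modelled as shared trigrams / distinct trigrams across both strings.
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = ta | tb
  return 0.0 if union.empty?

  (ta & tb).size.to_f / union.size
end

# The query's trigrams are all contained in the address's trigrams.
puts trigram_similarity("example.com", "name@example.com").round(3)
```

In this model the query shares 12 of the 17 distinct trigrams, which is well above pg_trgm's default similarity threshold of 0.3. Note that the score drops as the stored text gets longer relative to the query, so a short fragment searched against a long value may fall below the threshold.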

I confirmed this by running the following command in psql:

grant=# SELECT plainto_tsquery('example.com') @@ to_tsvector('english', 'name@example.com');
 ?column? 
----------
 f
(1 row)

I know that the parser does detect email addresses, so I think it must be possible. But it would involve building a text search dictionary in PostgreSQL that would properly split the email address up into tokens.

Here is evidence that the text search parser knows that it is an email address:

grant=# SELECT ts_debug('english', 'name@example.com');
                                  ts_debug                                   
-----------------------------------------------------------------------------
 (email,"Email address",name@example.com,{simple},simple,{name@example.com})
(1 row)
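
As an illustration of what "splitting the email address up into tokens" could mean, the following pure-Ruby sketch breaks an address into its parts and then applies prefix matching to each part, which is roughly what :prefix => true would do if the tokens were indexed separately. This is only a model of the idea, not a PostgreSQL dictionary and not part of pg_search; `email_tokens` and `prefix_hit?` are hypothetical helper names.

```ruby
# Hypothetical helper: tokens a splitting step might produce for an email
# address (the full address, the local part, the domain, and each domain label).
def email_tokens(email)
  normalized = email.downcase
  local, domain = normalized.split("@", 2)
  tokens = [normalized, local]
  if domain
    tokens << domain
    tokens.concat(domain.split("."))
  end
  tokens.compact.uniq
end

# Hypothetical helper: prefix matching against every token, not just the
# start of the whole address.
def prefix_hit?(query, email)
  q = query.downcase
  email_tokens(email).any? { |token| token.start_with?(q) }
end

p email_tokens("name@example.com")
p prefix_hit?("example.com", "name@example.com")
p prefix_hit?("ample", "name@example.com")
```

With tokens like these, a query for example.com matches because it is a prefix of the domain token, while a fragment such as ample still does not, since it is not at the start of any token. Arbitrary mid-word fragments remain the territory of trigram search.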
