PostgreSQL LIKE query performance variations

Problem description

I have been seeing quite a large variation in response times for LIKE queries against a particular table in my database. Sometimes I get results within 200-400 ms (very acceptable), but other times it can take as long as 30 seconds to return results.

I understand that LIKE queries are very resource intensive, but I just don't understand why there would be such a large difference in response times. I have built a btree index on the owner1 field, but I don't think it helps with LIKE queries. Does anyone have any ideas?

Example SQL:

SELECT gid, owner1 FROM parcels
WHERE owner1 ILIKE '%someones name%' LIMIT 10

I have also tried:

SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('%someones name%') LIMIT 10

And:

SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('someones name%') LIMIT 10

With similar results.

Table row count: about 95,000.

Recommended answer

FTS does not support LIKE

The previously accepted answer was incorrect. Full text search with its full text indexes is not for the LIKE operator at all; it has its own operators and doesn't work for arbitrary strings. It operates on words, based on dictionaries and stemming. It does support prefix matching for words, but not with the LIKE operator:

  • Get partial match from GIN indexed TSVECTOR column

Install the additional module pg_trgm, which provides operator classes for GIN and GiST trigram indexes to support all LIKE and ILIKE patterns, not just left-anchored ones:

Example index:

CREATE INDEX tbl_col_gin_trgm_idx  ON tbl USING gin  (col gin_trgm_ops);

Or:

CREATE INDEX tbl_col_gist_trgm_idx ON tbl USING gist (col gist_trgm_ops);

  • Difference between GiST and GIN index

Example query:

SELECT * FROM tbl WHERE col LIKE '%foo%';   -- leading wildcard
SELECT * FROM tbl WHERE col ILIKE '%foo%';  -- works case insensitively as well



Trigrams? What about shorter strings?

Words with fewer than 3 letters in indexed values still work. The manual:


    Each word is considered to have two spaces prefixed and one space suffixed when determining the set of trigrams contained in the string.
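To make the padding rule concrete, here is a minimal Python sketch of how trigrams could be derived from an indexed value. It is an approximation for illustration, not pg_trgm's actual code: the function name `trigrams` is hypothetical, and it assumes lowercasing and ASCII-only alphanumeric words.

```python
import re


def trigrams(text: str) -> set[str]:
    """Approximate the trigram set pg_trgm would extract from an indexed value.

    Assumption: words are runs of ASCII letters and digits, compared in
    lower case. Real pg_trgm behavior depends on its build options and on
    the database locale.
    """
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Two spaces prefixed and one space suffixed, as the manual describes.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


print(sorted(trigrams("cat")))
print(sorted(trigrams("a")))
```

Because of the padding, even a one-letter word contributes trigrams, which is why short words in indexed values remain findable.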

And search patterns with fewer than 3 letters? The manual:


    For both LIKE and regular-expression searches, keep in mind that a pattern with no extractable trigrams will degenerate to a full-index scan.

This means that index / bitmap index scans still work (query plans for prepared statements won't break); it just won't buy you better performance. Typically that is no big loss, since 1- or 2-letter strings are hardly selective (more than a few percent of the underlying table matches), and index support would not improve performance to begin with, because a full table scan is faster.
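A rough Python sketch can show why a short pattern has "no extractable trigrams". It is a simplified model, not pg_trgm's implementation: the function name `pattern_trigrams` is hypothetical, backslash escapes are ignored, and it assumes a word fragment is padded with spaces only on a side that is not adjacent to a wildcard.

```python
def pattern_trigrams(pattern: str) -> set[str]:
    """Simplified model of the trigrams usable for a LIKE pattern.

    Assumptions: '%' and '_' are the only wildcards, escapes are ignored,
    and matching is done in lower case.
    """
    result = set()
    n = len(pattern)
    i = 0
    while i < n:
        if not pattern[i].isalnum():
            i += 1
            continue
        start = i
        while i < n and pattern[i].isalnum():
            i += 1
        word = pattern[start:i].lower()
        # A wildcard next to the fragment means the word may continue there,
        # so no space padding can be assumed on that side.
        left_wild = start > 0 and pattern[start - 1] in "%_"
        right_wild = i < n and pattern[i] in "%_"
        padded = ("" if left_wild else "  ") + word + ("" if right_wild else " ")
        for j in range(len(padded) - 2):
            result.add(padded[j:j + 3])
    return result


print(pattern_trigrams("%foo%"))  # one usable trigram
print(pattern_trigrams("%fo%"))   # nothing to look up in the index
```

Under these assumptions, '%fo%' yields an empty set, so the index cannot narrow the search and every entry has to be visited.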


For just left-anchored patterns (no leading wildcard), you get the optimum with a suitable operator class for a btree index: text_pattern_ops or varchar_pattern_ops. Both are built-in features of standard Postgres; no additional module is needed. Similar performance, but a much smaller index.

Example index:

CREATE INDEX tbl_col_text_pattern_ops_idx ON tbl(col text_pattern_ops);

Example query:

SELECT * FROM tbl WHERE col LIKE 'foo%';  -- no leading wildcard

Or, if you happen to be running your database with the 'C' locale (effectively no locale), then everything is sorted according to byte order anyway, and a plain btree index with the default operator class does the job.
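The reason a byte-ordered btree can serve a left-anchored pattern is that all values sharing a prefix sit next to each other in sort order, so LIKE 'foo%' becomes roughly a range lookup between 'foo' and 'fop'. The Python sketch below illustrates that idea on a sorted list; the helper name `prefix_matches` is hypothetical, and Python's code-point ordering stands in for byte order.

```python
from bisect import bisect_left


def prefix_matches(sorted_values: list[str], prefix: str) -> list[str]:
    """Return all values starting with prefix, using two binary searches.

    Assumes sorted_values is sorted by plain code-point order and that the
    last character of prefix is not the maximum code point.
    """
    if not prefix:
        return list(sorted_values)
    # Smallest string that sorts after every string carrying the prefix:
    # bump the last character by one ('foo' -> 'fop').
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect_left(sorted_values, prefix)
    hi = bisect_left(sorted_values, upper)
    return sorted_values[lo:hi]


values = sorted(["bar", "foo", "food", "fop", "Foo", "fo"])
print(prefix_matches(values, "foo"))
```

With a locale-aware sort order this contiguity is not guaranteed, which is why a non-'C' locale needs the pattern operator classes instead of the default one.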

More details, explanation, examples and links in these related answers on dba.SE:

  • Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
  • How is LIKE implemented?
  • Finding similar strings with PostgreSQL quickly
