How to index a PostgreSQL JSONB flat text array for fuzzy and right-anchored searches?

Question

PostgreSQL version: 9.6.

The events table has a visitors JSONB column:

CREATE TABLE events (name VARCHAR(256), visitors JSONB);

The visitors column contains a "flat" JSON array:

["John Doe","Frédéric Martin","Daniel Smith",...].

The events table contains 10 million rows; each row has between 1 and 20 visitors.

Is it possible to index the values of the array so that pattern-matching searches like the following run efficiently?

  1. left anchored: select events whose visitors match 'John%'
  2. right anchored: select events whose visitors match '%Doe'
  3. unaccented: select events whose visitors match 'Frederic%'
  4. case-insensitive: select events whose visitors match 'john%'

I am aware of the Postgres trigram extension pg_trgm and its gin_trgm_ops operator class, which make it possible to create indexes for case-insensitive and right-anchored searches, but I can't figure out how to create a trigram index on the contents of a "flat" JSON array.

I read "Pattern matching on jsonb key/value" and "Index for finding an element in a JSON array", but the solutions provided there do not seem to apply to my use case.

Recommended answer

You should cast the jsonb to text and create a trigram index on it:

CREATE EXTENSION pg_trgm;
CREATE INDEX ON events USING gin
   ((visitors::text) gin_trgm_ops);
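
To see what this index actually covers, it can help to look at the text form of a jsonb array. A minimal sketch using the sample names from the question (the literal below is illustrative data, not a row from a real table):

```sql
-- Casting jsonb to text yields the array's text representation;
-- that string is what the expression index on (visitors::text) indexes.
SELECT '["John Doe","Frédéric Martin","Daniel Smith"]'::jsonb::text
       AS visitors_text;

-- Every element stays wrapped in double quotes in the result, so a
-- double quote marks both the start and the end of each visitor name.
```

Because the whole array becomes one string, a pattern can match anywhere in it, which is why the searches against this index have to say where a name starts or ends.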

Then use regular expression searches on the column. For example, to search for John Doe, you can use:

SELECT ...
FROM events
WHERE visitors::text ~* '\mJohn Doe\M';

The trigram index will support this query.
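
The four searches from the question could be expressed against the same text cast roughly as follows. This is a sketch under stated assumptions, not a tested recipe: it assumes visitor names contain no embedded double quotes, it relies on each array element being wrapped in double quotes in the text form, and it introduces a hypothetical helper, f_unaccent, that is not part of the original answer. The helper is an IMMUTABLE wrapper around unaccent(), which is only STABLE and therefore cannot be used directly in an index expression; it also assumes the unaccent extension is installed in the public schema.

```sql
-- 1. Left-anchored ('John%'): a name starts right after an opening quote.
SELECT name FROM events WHERE visitors::text ~ '"John';

-- 2. Right-anchored ('%Doe'): a name ends right before a closing quote.
SELECT name FROM events WHERE visitors::text ~ 'Doe"';

-- 4. Case-insensitive ('john%'): the same pattern with ~* instead of ~.
SELECT name FROM events WHERE visitors::text ~* '"john';

-- 3. Unaccented ('Frederic%'): needs a second index on the unaccented text.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical helper (an assumption, not from the answer): pins the
-- dictionary explicitly so the function can be declared IMMUTABLE.
CREATE FUNCTION f_unaccent(text) RETURNS text AS
$$ SELECT public.unaccent('public.unaccent', $1) $$
LANGUAGE sql IMMUTABLE;

CREATE INDEX ON events USING gin
   (f_unaccent(visitors::text) gin_trgm_ops);

SELECT name FROM events
WHERE f_unaccent(visitors::text) ~ '"Frederic';

-- Inspect the plan to check whether the trigram index is chosen.
EXPLAIN
SELECT name FROM events WHERE visitors::text ~ 'Doe"';
```

Trigram indexes are lossy, so PostgreSQL rechecks every candidate row against the full pattern, including the quote characters. Very short patterns give the index few or no trigrams to work with, so the planner may fall back to a sequential scan; EXPLAIN on real data is the way to confirm.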
