Table indexes for Text[] array columns


Problem description


I have a PostgreSQL database table with text[] (array) columns defined on it. I'm using these columns to search for a specific record in the database in this way:

select obj from business
where ((('street' = ANY (address_line_1)
    and 'a_city' = ANY (city)
    and 'a_state' = ANY (state))
or    ('street' = ANY (address_line_1)
    and '1234' = ANY (zip_code)))
and ('a_business_name' = ANY (business_name)
    or 'a_website' = ANY (website_url)
    or array['123'] && phone_numbers))

The problem I'm having is that with about 1 million records, the query gets really slow. My question is simple: do array columns support different types of indexes? Does anybody know the best type of index to create in this case (assuming there are different types)?

Just in case, this is the EXPLAIN ANALYZE output:

Seq Scan on business  (cost=0.00..207254.51 rows=1 width=32) (actual time=18850.462..18850.462 rows=0 loops=1)
  Filter: (('a'::text = ANY (address_line_1)) AND (('a'::text = ANY (business_name)) OR ('a'::text = ANY (website_url)) OR ('{123}'::text[] && phone_numbers)) AND ((('a'::text = ANY (city)) AND ('a'::text = ANY (state))) OR ('1234'::text = ANY (zip_code))))
  Rows Removed by Filter: 900506
Total runtime: 18850.523 ms

Thanks in advance!

Solution

You can use a GIN index to speed up searches on array columns.
Use it in combination with the array operators, such as @> (contains) and && (overlaps).

For instance:

CREATE INDEX business_address_line_1_idx ON business USING GIN (address_line_1);

Do that for all array columns involved in conditions.
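
As a rough sketch of what that could look like for the query in the question: the `'value' = ANY (column)` form is not one of the operators a GIN index supports, so each test is rewritten below with `@>` (contains), which is. The condition on address_line_1 appears in both OR branches of the original query, so it is factored out here. The index names are made up for illustration, and whether the planner actually picks the indexes depends on the data, so it is worth checking the result with EXPLAIN ANALYZE.

```sql
-- One GIN index per array column used in the WHERE clause.
-- Index names are illustrative; address_line_1 is covered by the example above.
CREATE INDEX business_city_idx          ON business USING GIN (city);
CREATE INDEX business_state_idx         ON business USING GIN (state);
CREATE INDEX business_zip_code_idx      ON business USING GIN (zip_code);
CREATE INDEX business_business_name_idx ON business USING GIN (business_name);
CREATE INDEX business_website_url_idx   ON business USING GIN (website_url);
CREATE INDEX business_phone_numbers_idx ON business USING GIN (phone_numbers);

-- Same logic as the original query, written with indexable array operators.
SELECT obj
FROM   business
WHERE  address_line_1 @> ARRAY['street']
AND   (   (city @> ARRAY['a_city'] AND state @> ARRAY['a_state'])
       OR zip_code @> ARRAY['1234'])
AND   (   business_name @> ARRAY['a_business_name']
       OR website_url   @> ARRAY['a_website']
       OR phone_numbers && ARRAY['123']);
```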

It might be worth considering normalizing your schema instead. Splitting the multiple entries out into separate (1:n or n:m) tables may serve you better. It often does in the long run, even if it seems like more work at first.
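
A minimal sketch of the 1:n variant for a single column, assuming the business table has an integer primary key called business_id (the question does not show one, so that column and the business_city table are hypothetical). With one row per value, a plain B-tree index is enough for equality lookups.

```sql
-- Hypothetical child table: one row per city of a business.
-- Assumes business.business_id exists and is the primary key.
CREATE TABLE business_city (
    business_id integer NOT NULL REFERENCES business (business_id),
    city        text    NOT NULL,
    PRIMARY KEY (business_id, city)
);

-- Ordinary B-tree index for lookups by city.
CREATE INDEX business_city_city_idx ON business_city (city);

-- Replaces the test 'a_city' = ANY (city) from the original query.
SELECT b.obj
FROM   business b
WHERE  EXISTS (
    SELECT 1
    FROM   business_city c
    WHERE  c.business_id = b.business_id
    AND    c.city = 'a_city'
);
```

The other array columns would get the same treatment, each with its own child table, or an n:m link table where values are shared between businesses.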
