PostgreSQL表索引 [英] PostgreSQL table indexing
问题描述
我想为以下查询建立表索引:
I want to index my tables for the following query:
select
t.*
from main_transaction t
left join main_profile profile on profile.id = t.profile_id
left join main_customer customer on (customer.id = profile.user_id)
where
(upper(t.request_no) like upper(('%'||@requestNumber||'%')) or OR upper(c.phone) LIKE upper(concat('%',||@phoneNumber||,'%')))
and t.service_type = 'SERVICE_1'
and t.status = 'SUCCESS'
and t.mode = 'AUTO'
and t.transaction_type = 'WITHDRAW'
and customer.client = 'corp'
and t.pub_date>='2018-09-05' and t.pub_date<='2018-11-05'
order by t.pub_date desc, t.id asc
LIMIT 1000;
这是我尝试索引表的方式:
This is how I tried to index my tables:
CREATE INDEX main_transaction_pr_id ON main_transaction (profile_id);
CREATE INDEX main_profile_user_id ON main_profile (user_id);
CREATE INDEX main_customer_client ON main_customer (client);
CREATE INDEX main_transaction_gin_req_no ON main_transaction USING gin (upper(request_no) gin_trgm_ops);
CREATE INDEX main_customer_gin_phone ON main_customer USING gin (upper(phone) gin_trgm_ops);
CREATE INDEX main_transaction_general ON main_transaction (service_type, status, mode, transaction_type); --> don't know if this one is true!!
像上面一样建立索引之后,我的查询只花了1000秒就花了4.5秒以上!
After indexing like above my query is spending over 4.5 seconds for just selecting 1000 rows!
我从下表中进行选择,该表具有34列,其中包括3个FOREIGN KEY,并且其数据行超过300万:
I am selecting from the following table which has 34 columns including 3 FOREIGN KEYs and it has over 3 million data rows:
CREATE TABLE main_transaction (
id integer NOT NULL DEFAULT nextval('main_transaction_id_seq'::regclass),
description character varying(255) NOT NULL,
request_no character varying(18),
account character varying(50),
service_type character varying(50),
pub_date" timestamptz(6) NOT NULL,
"service_id" varchar(50) COLLATE "pg_catalog"."default",
....
);
我还加入了两个表( main_profile
, main_customer
),用于搜索 customer.phone
和选择 customer.client
.要从 main_transaction
表访问 main_customer
表,我只能按 main_profile
I am also joining two tables (main_profile
, main_customer
) for searching customer.phone
and for selecting customer.client
. To get to the main_customer
table from main_transaction
table, I can only go by main_profile
我的问题是如何为我的表建立索引以提高上述查询的性能?
My question is how can I index my table too increase performance for above query?
在这种情况下,请不要将 UNION
用于 OR
(upper(t.request_no)像upper(('%'|| @@ requestNumber ||'%'))或OR upper(c.phone)像upper(concat('%',|| @phoneNumber ||,'%')))
代替,我们可以使用 case when
条件?因为,我必须将PostgreSQL查询转换为Hibernate JPA!而且我不知道如何转换 UNION
,除了不允许使用的 Hibernate-Native SQL
.
Please, do not use UNION
for OR
for this case (upper(t.request_no) like upper(('%'||@requestNumber||'%')) or OR upper(c.phone) LIKE upper(concat('%',||@phoneNumber||,'%')))
instead can we use case when
condition? Because, I have to convert my PostgreSQL query into Hibernate JPA! And I don't know how to convert UNION
except Hibernate - Native SQL
which I am not allowed to use.
说明:
Limit (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.381 rows=1 loops=1)
-> Sort (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.380 rows=1 loops=1)
Sort Key: t.pub_date DESC, t.id
Sort Method: quicksort Memory: 27kB
-> Hash Join (cost=20817.10..411600.73 rows=38 width=1906) (actual time=3214.473..3885.369 rows=1 loops=1)
Hash Cond: (t.profile_id = profile.id)
Join Filter: ((upper((t.request_no)::text) ~~ '%20181104-2158-2723948%'::text) OR (upper((customer.phone)::text) ~~ '%20181104-2158-2723948%'::text))
Rows Removed by Join Filter: 593118
-> Seq Scan on main_transaction t (cost=0.00..288212.28 rows=205572 width=1906) (actual time=0.068..1527.677 rows=593119 loops=1)
Filter: ((pub_date >= '2016-09-05 00:00:00+05'::timestamp with time zone) AND (pub_date <= '2018-11-05 00:00:00+05'::timestamp with time zone) AND ((service_type)::text = 'SERVICE_1'::text) AND ((status)::text = 'SUCCESS'::text) AND ((mode)::text = 'AUTO'::text) AND ((transaction_type)::text = 'WITHDRAW'::text))
Rows Removed by Filter: 2132732
-> Hash (cost=17670.80..17670.80 rows=180984 width=16) (actual time=211.211..211.211 rows=181516 loops=1)
Buckets: 131072 Batches: 4 Memory Usage: 3166kB
-> Hash Join (cost=6936.09..17670.80 rows=180984 width=16) (actual time=46.846..183.689 rows=181516 loops=1)
Hash Cond: (customer.id = profile.user_id)
-> Seq Scan on main_customer customer (cost=0.00..5699.73 rows=181106 width=16) (actual time=0.013..40.866 rows=181618 loops=1)
Filter: ((client)::text = 'corp'::text)
Rows Removed by Filter: 16920
-> Hash (cost=3680.04..3680.04 rows=198404 width=8) (actual time=46.087..46.087 rows=198404 loops=1)
Buckets: 131072 Batches: 4 Memory Usage: 2966kB
-> Seq Scan on main_profile profile (cost=0.00..3680.04 rows=198404 width=8) (actual time=0.008..20.099 rows=198404 loops=1)
Planning time: 0.757 ms
Execution time: 3885.680 ms
推荐答案
由于不使用 UNION
的限制,您将没有一个好的计划.
With the restriction to not use UNION
, you won't get a good plan.
您可以使用以下索引稍微加快处理速度:
You can slightly speed up processing with the following indexes:
main_transaction ((service_type::text), (status::text), (mode::text),
(transaction_type::text), pub_date)
main_customer ((client::text))
这些至少应该摆脱顺序扫描,但是将保留占处理时间最多的散列连接.
These should at least get rid of the sequential scans, but the hash join that takes the lion's share of the processing time will remain.
这篇关于PostgreSQL表索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!