PostgreSQL根据时间戳匹配开始时间和结束时间之间的间隔 [英] PostgreSQL matching interval between start and end time against timestamp
问题描述
我正在设计一些系统,该系统将存储包含开始时间和结束时间的记录。例如:
I am designing some system that will store records containing a begin and end time. For example:
CREATE TABLE test (
id bigserial PRIMARY KEY,
ts_start timestamp NOT NULL,
ts_end timestamp NOT NULL,
foo bar NOT NULL,
...
);
现在,我想对此进行查询,以查找与某个时间戳重叠的所有行。这将导致where子句,例如:
Now I want to run queries on this to find all rows that overlap with a certain timestamp. This would result in a where clause like:
WHERE ts_start <= '2006-4-6 12:34:56' AND ts_end > '2006-4-6 12:34:56'
我确实用大量的生成的测试数据,性能相当差。我用ts_start上的索引和ts_end上的另一个索引以及ts_start和ts_end上的多列索引进行了测试。最后一个给出了最好的结果,但仍然远非最佳。
I did test this with a massive amount of generated test data and the performance is quite bad. I tested it with an index on ts_start and another index on ts_end and also with an multi column index on ts_start and ts_end. The last gave the best result but it is still far from optimal.
问题是postgresql不知道ts_end比ts_start更大,因此它使用的计划能够找到ts_end较小的行然后是ts_start。
The problem is that postgresql doesn't know the fact that ts_end is guaranteed to be larger then ts_start so it uses a plan that is capable of finding rows where ts_end is smaller then ts_start.
有人建议如何解决这个问题吗?
Any suggestions how to solve this problem?
编辑:
如果您可以等待更长的时间,也会出现此问题,那么PostgreSQL 9.2就是一个完美的解决方案: range类型。 9.2是beta版本,现在最终版本很有可能在2012年底。
For people having this problem too if you can wait a little longer then PostgreSQL 9.2 has the perfect solution: range types. 9.2 is in beta now final release will most likely be at the end of 2012.
推荐答案
有临时postgres(谷歌它),但我不知道它是否仍在维护...我相信有人在讨论将这种类型的搜索包含在postgres中,但我不记得它的最终状态。无论如何:
There was "temporal postgres" (google it) but I don't know if it is still maintained... I believe there was a discussion of including this type of search into postgres but I don't remember the final state of it. Anyway :
使用box和gist的示例:
Example using box and gist :
CREATE TABLE segments( start INTEGER NOT NULL, stop INTEGER NOT NULL, range_box BOX NOT NULL );
INSERT INTO segments SELECT n,n+1,BOX(POINT(n,-1),POINT(n+1,1)) FROM generate_series( 1, 1000000 ) n;
CREATE INDEX segments_box ON segments USING gist( range_box );
CREATE INDEX segments_start ON segments(start);
CREATE INDEX segments_stop ON segments(stop);
EXPLAIN ANALYZE SELECT * FROM segments WHERE 300000 BETWEEN start AND stop;
Index Scan using segments_start on segments (cost=0.00..12959.24 rows=209597 width=72) (actual time=91.990..91.990 rows=2 loops=1)
Index Cond: (300000 >= start)
Filter: (300000 <= stop)
Total runtime: 92.023 ms
EXPLAIN ANALYZE SELECT * FROM segments WHERE range_box && '(300000,0,300000,0)'::BOX;
Bitmap Heap Scan on segments (cost=283.49..9740.27 rows=5000 width=72) (actual time=0.036..0.037 rows=2 loops=1)
Recheck Cond: (range_box && '(300000,0),(300000,0)'::box)
-> Bitmap Index Scan on segments_box (cost=0.00..282.24 rows=5000 width=0) (actual time=0.032..0.032 rows=2 loops=1)
Index Cond: (range_box && '(300000,0),(300000,0)'::box)
Total runtime: 0.064 ms
正如您所见,这里的要点索引非常快(1500倍!哈哈)
(而且您可以使用许多运算符,例如重叠,包含,包含等)。
As you can see gist index is ridiculously fast here (1500 times ! lol) (and you can use many operators like overlaps, is contained, contains, etc.
http://www.postgresql.org/docs/8.2 /static/functions-geometry.html
这篇关于PostgreSQL根据时间戳匹配开始时间和结束时间之间的间隔的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!