ST_DWithin有时不使用索引 [英] ST_DWithin sometimes doesn't use index
问题描述
我将PostGIS与Postgresql结合使用,以便能够通过存储在位置列Geometry/Point SRID: 4326
中的坐标在某个半径范围内定位条目.这是我正在尝试的两个查询:
I'm using PostGIS with Postgresql in order to be able to locate entries within some radius by coordinates stored in location column Geometry/Point SRID: 4326
. Here are two queries that I'm experimenting with:
第一个距离以米为单位并且use_spheroid = true
First one with distance in meters and use_spheroid=true
EXPLAIN ANALYZE SELECT count(*) FROM "cars" WHERE ST_DWithin(location, ST_SetSRID(ST_MakePoint(20, -30), 4326), 105000, true) LIMIT 1000;
QUERY PLAN
--------------
Limit (cost=11884.28..11884.30 rows=1 width=8) (actual time=18.843..18.844 rows=1 loops=1)
-> Aggregate (cost=11884.28..11884.30 rows=1 width=8) (actual time=18.842..18.843 rows=1 loops=1)
-> Seq Scan on cars (cost=0.00..11883.33 rows=381 width=0) (actual time=0.486..18.827 rows=38 loops=1)
Filter: (((location)::geography && '0101000020E610000000000000000034400000000000003EC0'::geography) AND ('0101000020E610000000000000000034400000000000003EC0'::geography && _st_expand((location)::geography, '105000'::double precision)) AND _st_dwithin((location)::geography, '0101000020E610000000000000000034400000000000003EC0'::geography, '105000'::double precision, true))
Rows Removed by Filter: 28549
Planning time: 0.166 ms
Execution time: 18.878 ms
(7 rows)
第二,我假设接受距离(以度为单位),默认情况下use_spheroid为false.更正:原来这仍然使用use_spheroid = true,但是与该调用匹配的函数签名需要几何和SRID单位,即4326的度数.
Second, I assume, accepts distance in degrees and use_spheroid is false by default. CORRECTION: Turned out this still uses use_spheroid=true, but function signature matching this call expects geometries and SRID units, which is degrees for 4326.
EXPLAIN ANALYZE SELECT count(*) FROM "cars" WHERE ST_DWithin(location, ST_SetSRID(ST_MakePoint(20, -30), 4326), 1) LIMIT 1000;
QUERY PLAN
-----------------------------
Limit (cost=145.30..145.31 rows=1 width=8) (actual time=0.154..0.155 rows=1 loops=1)
-> Aggregate (cost=145.30..145.31 rows=1 width=8) (actual time=0.154..0.154 rows=1 loops=1)
-> Bitmap Heap Scan on cars (cost=4.59..145.29 rows=3 width=0) (actual time=0.050..0.147 rows=37 loops=1)
Recheck Cond: (location && '0103000020E6100000010000000500000000000000000033400000000000003FC000000000000033400000000000003DC000000000000035400000000000003DC000000000000035400000000000003FC000000000000033400000000000003FC0'::geometry)
Filter: (('0101000020E610000000000000000034400000000000003EC0'::geometry && st_expand(location, '1'::double precision)) AND _st_dwithin(location, '0101000020E610000000000000000034400000000000003EC0'::geometry, '1'::double precision))
Rows Removed by Filter: 11
Heap Blocks: exact=47
-> Bitmap Index Scan on cars_location_index (cost=0.00..4.59 rows=42 width=0) (actual time=0.037..0.037 rows=48 loops=1)
Index Cond: (location && '0103000020E6100000010000000500000000000000000033400000000000003FC000000000000033400000000000003DC000000000000035400000000000003DC000000000000035400000000000003FC000000000000033400000000000003FC0'::geometry)
Planning time: 0.280 ms
Execution time: 0.188 ms
(11 rows)
两个查询都返回相似的结果(+/-是因为精度).但是第一个的运行速度要慢100倍.同样将use_spheroid
设置为false不能保证使用索引,当距离太小(<0.4)或太大(> 45)时,它都会退回到Seq Scan.这是应该的样子还是我做错了什么?
Both queries return similar results (+/- because of precision). However first one runs 100 times slower. Also setting use_spheroid
to false doesn't guarantee using of index, it falls back to Seq Scan either when the distance to small (<0.4) or too big (>45). Is this how it's supposed to be or am I doing something wrong?
添加:经过更多的实验后,我将列类型更改为Geography.Point,现在它始终使用索引.问题似乎已经解决,但我仍然对我在几何"类型中观察到的行为感到困惑.
ADDITION: After some more experiments I changed column types to Geography.Point and now it always uses index. Problem seems to be solved, but I'm still confused with that behavior that I observed with Geometry type.
推荐答案
ST_DWithin 文档指出第一个功能签名接受的是地理类型,而不是几何类型:
The ST_DWithin documentation states that the first function signature accepts geography types not geometry types :
boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters, boolean use_spheroid);
由于(location,ST_SetSRID(ST_MakePoint(20,-30),4326))都是几何图形,因此该函数的执行混乱是有道理的. 而且我认为您的第二个功能正常运行,因为您正在执行的是此签名:
since (location,ST_SetSRID(ST_MakePoint(20, -30), 4326)) are all geometries it would make sense that the execution of the function is messy. And i think your second function was working properly because what you were executing was this signature:
boolean ST_DWithin(geometry g1, geometry g2, double precision distance_of_srid);
正如您所说,将列类型切换为Geography而不是Geometry将解决问题,因为这将使您正确执行:
And as you stated that switching the column types to Geography instead of Geometry would solve the issue as that would get you the correct execution of :
boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters);
boolean ST_DWithin(geography gg1, geography gg2, double precision distance_meters, boolean use_spheroid);
希望这会有所帮助.
在文档中找到了这一部分,该文件指出条目
Found this part in the documentation that states that upon data entry
如果标准几何类型数据是SRID 4326,则会自动广播到地理
Standard geometry type data will autocast to geography if it is of SRID 4326
这可以解释为什么Postgres接受了您对ST_DWithin()的第一次调用,因为postgis显然会将其转换为地理位置,并且还解释了为什么执行时间更长并且忽略了索引,因为每次转换都会产生一个新对象,即没有在原始列中建立索引.
this could explain why Postgres accepted your first invocation of the ST_DWithin() as postgis apparently would cast it to geography, and that also explains why the execution takes longer and neglects the index as each casting would result in a new object that is not indexed in your original column.
这篇关于ST_DWithin有时不使用索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!