使用Rtree和普通索引的SQLite查询慢 [英] SQLite query using rtree and normal index slow

查看:279
本文介绍了使用Rtree和普通索引的SQLite查询慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在SQLite表中具有地理空间数据,带有坐标的名称,并为该位置创建了一个rtree,并在名称列上创建了普通索引.

I have geospatial data, names with coordinates, in a SQLite table and created an rtree for the location and a normal index on the name-column.

Rtree根据此文档使用: http://www.sqlite.org/rtree.html

Rtree is used according to this docs: http://www.sqlite.org/rtree.html

当我查询特定区域中的记录时,将使用rtree并且它可以快速运行:

When I query records in a specific area, the rtree is used and it works fast:

SELECT demo_data.* FROM demo_data, demo_index
WHERE demo_data.id=demo_index.id
   AND minX>=-81.0 AND maxX<=-79.6
   AND minY>=35.0 AND maxY>=36.2;

当我只查询名称时,它也很快,因为使用了名称索引:

When I query just for names, it also goes fast, because the name-index is used:

SELECT demo_data.* FROM demo_data
WHERE objname="Test"

但是当我将两者结合在一起时,它的速度非常慢,似乎是在扫描整个表:

But when I combine the two, its very slow, seems like the whole table is scanned:

SELECT demo_data.* FROM demo_data, demo_index
WHERE demo_data.id=demo_index.id
   AND objname="Test"
   AND minX>=-81.0 AND maxX<=-79.6
   AND minY>=35.0 AND maxY>=36.2;

为什么使用两个索引的组合查询这么慢?

Why is this combined query using two indexes so slow?

更新:

经过对EXPLAIN QUERY PLAN的进一步调查,结果表明,每个条件实际上都使用了索引.但是执行组合查询的时间取决于第一个条件下的记录数.该表demo_data具有10mio记录.但只有在第一个条件返回大量记录的情况下,组合速度才很慢.在这种情况下,大约有objname ="Test"的1000条记录,合并的查询需要4秒钟.组合查询objname ="Test12345"(仅存在一次)非常快,只有10毫秒

After more investigation with EXPLAIN QUERY PLAN, it turned out, that the indexes are actually used by each individual condition. But the time for the execution of the combined query depends on the count of records in the first condition. This table demo_data has 10mio records. but the combination is only slow, if the first condition returns a lot of records. In this case, there is some 1000 records with objname="Test" and the combined query takes 4 seconds. A combined query for objname="Test12345", which only exist once, is very fast, only 10ms

推荐答案

涉及多个索引的查询很难加速,并且可能需要统计信息,甚至可能来自较早的查询.

Queries involving more than one index are hard to accelerate, and may require statistics, maybe even from earlier queries.

数据库可以(我不知道SQLite实现了什么),例如:

The database could (I don't know what SQLite implemented) e.g.:

  1. 纾困,并始终进行慢速扫描
  2. 仅使用第一个索引,扫描匹配的行
  3. 仅使用第二个索引,扫描匹配的行
  4. 从两个索引中获取ID,相交,然后重建所有行

选择最佳策略是查询优化器的任务.如果我们可以预测哪个指数产生较小的结果,通常2或3最好.它需要调用ANALYZE时获得的统计信息.

It's the task of the query optimizer to choose the best strategy. Often 2 or 3 is best, if we can predict which index yields the smaller result. It requires statistics obtained when calling ANALYZE.

使用EXPLAIN QUERY PLAN SELECT ...查看SQLite决定做什么: https://sqlite.org/eqp.html

Use EXPLAIN QUERY PLAN SELECT ... to see what SQLite decided to do: https://sqlite.org/eqp.html

您还可以尝试使用嵌套查询将优化器推向更好的计划,并且应该阅读查询优化器文档: https://www.sqlite.org/optoverview.html

You can also try nested queries to nudge the optimizer towards the better plan, and you should read the query optimizer documentation: https://www.sqlite.org/optoverview.html

这篇关于使用Rtree和普通索引的SQLite查询慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆