SQLite: efficient way to drop lots of rows


Question

SQLite, Android, true story. I have a table which I use as a cache:

CREATE TABLE cache(key TEXT, ts TIMESTAMP, size INTEGER, data BLOB);
CREATE UNIQUE INDEX by_key ON cache(key);
CREATE INDEX by_ts ON cache(ts);

During the app's lifetime I fill the cache, and at some point I want to clear it out and drop N records. Typically this table contains ~25000 blobs of ~100-500 KB each, with a total blob size of 600-800 MB in the DB, but for now I am testing with ~2000 blobs totalling about 60 MB (the numbers below are for this case). Clearing removes 90% of the cache entries.

I tried different ways to do it, briefly described here:

[1] Worst and simplest. First select, then remove one by one, walking a cursor. Terribly slow.

[2] Make SQLite do it with a query (delete the blobs totalling N bytes):

DELETE FROM cache WHERE
  ROWID IN (SELECT ROWID FROM cache WHERE
             (SELECT SUM(size) FROM cache AS _ WHERE _.ts <= cache.ts) <= N);

This is faster, but still terribly slow: ~15 sec. It also seems to have quadratic complexity.

[3] Find the row around which to cut (using the average blob size for the computation) and delete with a simple WHERE clause:

-- Find the row after which to delete; call its timestamp T0:
SELECT ts FROM cache ORDER BY ts LIMIT 1 OFFSET count;
-- Delete
DELETE FROM cache WHERE ts < T0;

This is much better, but still takes ~7 sec.
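The two steps above can be sketched in Python with the standard `sqlite3` module (the `trim_oldest` helper and the demo data are mine, not from the question; `count` is the number of oldest rows to drop):

```python
import sqlite3

def trim_oldest(conn, count):
    """Approach [3]: find the timestamp T0 at offset `count`, then delete
    every row older than it in one statement."""
    row = conn.execute(
        "SELECT ts FROM cache ORDER BY ts LIMIT 1 OFFSET ?", (count,)
    ).fetchone()
    if row is None:  # fewer than `count` rows: nothing to trim
        return
    with conn:  # one transaction around the bulk delete
        conn.execute("DELETE FROM cache WHERE ts < ?", (row[0],))

# Demo on an in-memory database with the schema from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache(key TEXT, ts TIMESTAMP, size INTEGER, data BLOB)")
conn.execute("CREATE UNIQUE INDEX by_key ON cache(key)")
conn.execute("CREATE INDEX by_ts ON cache(ts)")
with conn:
    conn.executemany("INSERT INTO cache VALUES (?, ?, ?, ?)",
                     [(f"k{i}", i, 10, b"x" * 10) for i in range(100)])
trim_oldest(conn, 90)  # drop the 90 oldest of 100 rows
print(conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0])  # → 10
```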

[4] Create a new table, copy the rows I need to keep, and drop the old one. Note that I create the indices in the new table AFTER copying everything:

  -- Insert only the rows I want to keep
  INSERT INTO temp(key, ts, size, data) SELECT key, ts, size, data 
    FROM cache ORDER BY ts LIMIT count;
  -- Drop table and indices.
  DROP INDEX by_key;
  DROP INDEX by_ts;
  DROP TABLE cache;
  -- Rename temp table and create indices...

Copying takes about 300 ms for 6 MB of blobs, but DROP TABLE takes ~8 sec.
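Spelled out as a runnable Python/sqlite3 sketch (the `temp` schema and the final rename are my reconstruction of the steps the question elides):

```python
import sqlite3

def rebuild_cache(conn, count):
    """Approach [4]: copy the `count` rows to keep into a fresh table,
    drop the old table, rename, and only then recreate the indices."""
    with conn:
        conn.execute(
            "CREATE TABLE temp(key TEXT, ts TIMESTAMP, size INTEGER, data BLOB)")
        conn.execute(
            "INSERT INTO temp(key, ts, size, data) "
            "SELECT key, ts, size, data FROM cache ORDER BY ts LIMIT ?",
            (count,))
        conn.execute("DROP INDEX by_key")
        conn.execute("DROP INDEX by_ts")
        conn.execute("DROP TABLE cache")
        conn.execute("ALTER TABLE temp RENAME TO cache")
        conn.execute("CREATE UNIQUE INDEX by_key ON cache(key)")
        conn.execute("CREATE INDEX by_ts ON cache(ts)")

# Demo: keep 5 of 20 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache(key TEXT, ts TIMESTAMP, size INTEGER, data BLOB)")
conn.execute("CREATE UNIQUE INDEX by_key ON cache(key)")
conn.execute("CREATE INDEX by_ts ON cache(ts)")
with conn:
    conn.executemany("INSERT INTO cache VALUES (?, ?, ?, ?)",
                     [(f"k{i}", i, 10, b"x") for i in range(20)])
rebuild_cache(conn, 5)
print(conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0])  # → 5
```

Note that the question's `ORDER BY ts LIMIT count` keeps the `count` oldest rows; for a cache one would more likely use `ORDER BY ts DESC` to keep the newest.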

Note that in all cases I also do a VACUUM, which takes another ~1 sec. How can I make this fast? Why are DROP TABLE and DELETE so slow? I suspect the indices: when I dropped the key index before the DELETE, it worked faster. How can I make SQLite delete quickly?

Answer

You are working on a database with "big" data; that is, each blob uses multiple pages.

At some point near optimal performance you will reach a limit you can't improve on.

Checking all your options, I see different behaviors, not just different algorithms.

[1] This one shouldn't be terribly slow as long as you use a transaction. It needs two operations at once: a query (to get the blob size) and a delete.
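For illustration (my sketch, not the poster's actual code), approach [1] with the whole loop inside a single transaction looks like this; committing after every row is what makes it terribly slow:

```python
import sqlite3

def delete_one_by_one(conn, max_bytes):
    """Walk the rows oldest-first and delete until `max_bytes` are freed.
    The single `with conn:` wraps every delete in one transaction."""
    freed = 0
    with conn:
        for key, size in conn.execute(
                "SELECT key, size FROM cache ORDER BY ts").fetchall():
            if freed >= max_bytes:
                break
            conn.execute("DELETE FROM cache WHERE key = ?", (key,))
            freed += size
    return freed

# Demo: 10 rows of 10 bytes each; free at least 50 bytes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache(key TEXT, ts TIMESTAMP, size INTEGER, data BLOB)")
conn.execute("CREATE UNIQUE INDEX by_key ON cache(key)")
with conn:
    conn.executemany("INSERT INTO cache VALUES (?, ?, ?, ?)",
                     [(f"k{i}", i, 10, b"x" * 10) for i in range(10)])
freed = delete_one_by_one(conn, 50)
print(freed, conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0])  # → 50 5
```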

[2] This is a good approach. The two queries and the delete are all in a single command, so the SQLite engine can optimize it.

[3] This behaves differently from all of the above. It is the same as DELETE FROM cache WHERE ts < (SELECT ts FROM cache ORDER BY ts LIMIT 1 OFFSET count). The query is cheaper than the previous one, but I bet the number of rows deleted is also far lower! The expensive part of query-plus-delete is the delete. Query optimization is important, but things will always get slower in the delete.

[4] This is a very bad approach! Copying all your data to a new table (maybe even another database) will be VERY expensive. I see only one advantage in it: you could copy the data to a new database and avoid the VACUUM, since the new database is built from scratch and is already clean.

About VACUUM... even worse than DELETE is VACUUM. VACUUM is not supposed to be used often on a database. I understand this algorithm is meant to "clean" your database, but cleaning shouldn't be a frequent operation: databases are optimized for select/insert/delete/update, not for keeping all data at a minimal size.

My choice would be a single DELETE ... WHERE rowid IN (SELECT ...) operation, based on predefined criteria. VACUUM wouldn't be used, or at least not often. One good option is to monitor the db size: when it runs over a limit, run the admittedly expensive cleanup to trim the database.

Finally, when using multiple commands, never forget to use transactions!
