sqlite: Fastest way to get all rows (consecutive disk access)

Question

I want to read all rows in a table using system.data.sqlite. Because the table is very large (> 450 GB, with > 6 billion rows), I want to be sure that sqlite will use consecutive disk access. As you may know, random access to a hard disk is slow. Due to memory restrictions I cannot load all the data at once. So the optimal way would be for sqlite to read a few hundred MB consecutively, then I process that data, and then sqlite reads the next chunk.

How can I be sure that sqlite will access the disk this way, rather than jumping from one position on the hard disk to another?

Things I know (I think these suggestions will show up):

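  • It might be better to use another DBMS. But I want/need to solve it with this one.
  • I know that the OS will position the disk head on other data while I am processing the data. That is fine. The point is only that a few hundred MB at a time should be read consecutively.
  • I do not want to (and cannot) split the database file into smaller parts.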

I found this post, but it does not address my problem correctly:
Which is the fastest way to retrieve all items in SQLite?

Recommended answer

That's what clustered indexes are for. sqlite doesn't support them, though.

The following is copied from: http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows

Four: Clustered Index

SQLite doesn't support clustered indexes (simply put, indexes that force the data in the database to be physically laid down in the SAME order as the index needs it to be in).

This means that if your index is a sequential INTEGER, the records are physically laid out in the database in that INTEGER's order: 1, then 2, then 3.

You can't make a clustered index, but you CAN sort your data in order so that any historical data is ordered nicely. Of course, as the database matures, you lose that, but it helps.

Someone else posted this, and it is a nice example to use, so I will. If you have a table WIBBLE whose field KEY you want to access a lot, it would be nice if everything was in order. Using the command line tool, you can create a fake cluster by doing the following:

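-- Copy the current rows into a scratch table.
create table wibble2 as select * from wibble;
-- Empty the original table, then refill it in key order.
delete from wibble;
insert into wibble select * from wibble2 order by key;
-- Remove the scratch table.
drop table wibble2;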
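
The same four statements can be tried end to end on a throwaway database. The sketch below is only an illustration: it uses Python's standard sqlite3 module rather than system.data.sqlite, and the in-memory database, the payload column, and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical in-memory database and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wibble (key INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO wibble (key, payload) VALUES (?, ?)",
    [(3, "c"), (1, "a"), (2, "b")],  # deliberately inserted out of key order
)
conn.commit()


def recluster(conn):
    """Run the 'fake cluster' statements so rows are re-inserted in key order."""
    conn.execute("CREATE TABLE wibble2 AS SELECT * FROM wibble")
    conn.execute("DELETE FROM wibble")
    conn.execute("INSERT INTO wibble SELECT * FROM wibble2 ORDER BY key")
    conn.execute("DROP TABLE wibble2")
    conn.commit()


recluster(conn)

# After the rebuild, walking the table by rowid visits the keys in sorted order.
keys_in_rowid_order = [
    row[0] for row in conn.execute("SELECT key FROM wibble ORDER BY rowid")
]
print(keys_in_rowid_order)
```

On a table of the size described in the question, the scratch copy temporarily needs roughly as much extra disk space as the table itself, so this is worth trying on a small copy first.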

Bottom line is you can re-order your records manually, but I imagine this would be practical for you only if you don't intend to write to the table too often.
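
For the read side of the question, a minimal sketch of processing the table in bounded-memory batches might look like the following. It again uses Python's sqlite3 module as a stand-in for system.data.sqlite; the helper name iter_batches, the batch size, and the sample data are hypothetical. Scanning in rowid order follows the B-tree in which SQLite stores an ordinary table, but whether those pages are also contiguous on disk depends on how the file was written, so this is not a guarantee of consecutive disk access.

```python
import sqlite3


def iter_batches(conn, batch_size):
    """Yield lists of rows from wibble, at most batch_size rows per list."""
    cursor = conn.execute("SELECT rowid, key FROM wibble ORDER BY rowid")
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield batch


# Hypothetical sample data: ten rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wibble (key INTEGER)")
conn.executemany("INSERT INTO wibble (key) VALUES (?)", [(k,) for k in range(1, 11)])
conn.commit()

# Only one batch is held in memory at a time while iterating.
batch_sizes = [len(batch) for batch in iter_batches(conn, 4)]
print(batch_sizes)
```

The batch size here is counted in rows, so a chunk of a few hundred MB would have to be approximated from the average row size.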
