Large primary key: 1+ billion rows MySQL + InnoDB?


Question

I was wondering whether InnoDB is the best storage engine for this table. The table contains a single field, the primary key, and will receive an estimated 816k rows a day, so it will get very large very quickly. I'm also working on a file-based storage approach; would that be faster? The table will store the ID numbers of Twitter IDs that have already been processed.

Also, is there any estimate of the memory usage of a SELECT MIN(id) statement? Any other ideas are greatly appreciated!

Recommended answer

The only definitive answer is to try both, test, and see what happens.

Generally, MyISAM is faster for writes and for reads, but not for both at the same time. When you write to a MyISAM table, the entire table is locked until the insert completes. InnoDB has more overhead but uses row-level locking, so reads and writes can happen concurrently without the problems that MyISAM's table locking incurs.

However, your problem, if I understand it correctly, is a little different. With only one column, and that column being the primary key, an important consideration is the different ways MyISAM and InnoDB handle primary key indexes.

In MyISAM, the primary key index is just like any other secondary index. Internally, each row has a row id, and the index nodes just point to the row ids in the data pages. A primary key index is not handled any differently from any other index.

In InnoDB, however, primary keys are clustered, meaning the index stays attached to the data pages and the row contents are kept in physically sorted order on disk according to the primary key (but only within a single data page; the pages themselves can be scattered in any order).

This being the case, I would expect InnoDB to have an advantage, because MyISAM would essentially have to do double work: write the integer once in the data pages, and then write it again in the index pages. InnoDB wouldn't do this; the primary key index would be identical to the data pages, so each value would only have to be written once. InnoDB would only have to manage the data in one place, whereas MyISAM would needlessly have to manage two copies.

For either storage engine, doing something like min() or max() on an indexed column should be trivial, as should checking whether a number exists in the index. Since the table has only one column, no bookmark lookups would even be necessary, because the data would be represented entirely within the index itself. This should be a very efficient index.
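
To give a feel for why these operations are cheap on an ordered index, here is a minimal Python sketch. It uses a sorted list as a stand-in for the leaf level of a B-tree; that is only an analogy, not how MySQL actually lays out index pages, and the names `processed_ids`, `min_id`, `max_id` and `contains` are made up for illustration.

```python
import bisect

# Assumption: a sorted list stands in for the ordered keys of an index.
# A real B-tree adds levels of pages on top, but the ordering idea is the same.
processed_ids = sorted([1005, 1001, 1009, 1003])


def min_id(index):
    """MIN(id): in an ordered index the smallest key is simply the first entry."""
    return index[0]


def max_id(index):
    """MAX(id): likewise, the largest key is the last entry."""
    return index[-1]


def contains(index, key):
    """Existence check: binary search, so O(log n) comparisons, no full scan."""
    pos = bisect.bisect_left(index, key)
    return pos < len(index) and index[pos] == key


if __name__ == "__main__":
    print(min_id(processed_ids), max_id(processed_ids))
    print(contains(processed_ids, 1003), contains(processed_ids, 1004))
```

None of these lookups touches more than a handful of entries, which is why the memory cost of such a query should not grow with the size of the table.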

I also wouldn't be all that worried about the size of the table. When a row is only one integer wide, you can fit a huge number of rows on each index/data page.
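
As a rough back-of-the-envelope check, the arithmetic can be sketched in Python. Only the 816k rows per day and the one-billion-row target come from the question; the 8-byte key, the 18 bytes of per-row overhead, the 16 KiB page size and the 15/16 fill factor are assumptions chosen for illustration, so treat the result as an order-of-magnitude estimate rather than a measurement.

```python
import math

ROWS_PER_DAY = 816_000        # estimate given in the question
TARGET_ROWS = 1_000_000_000   # the "1+ billion rows" in the title

# Assumptions (not from the original post):
KEY_BYTES = 8                 # a 64-bit integer column such as BIGINT
ROW_OVERHEAD_BYTES = 18       # rough per-row bookkeeping in a clustered index
PAGE_BYTES = 16 * 1024        # a 16 KiB data page
FILL_FACTOR = 15 / 16         # pages are assumed to be almost, not fully, packed


def days_to_reach(target_rows=TARGET_ROWS, rows_per_day=ROWS_PER_DAY):
    """Whole days needed to accumulate target_rows at a constant insert rate."""
    return math.ceil(target_rows / rows_per_day)


def rows_per_page():
    """How many single-integer rows fit in one page under the assumptions above."""
    usable_bytes = int(PAGE_BYTES * FILL_FACTOR)
    return usable_bytes // (KEY_BYTES + ROW_OVERHEAD_BYTES)


def estimated_size_gib(rows=TARGET_ROWS):
    """Approximate on-disk size of the leaf pages, in GiB."""
    pages = math.ceil(rows / rows_per_page())
    return pages * PAGE_BYTES / 2**30


if __name__ == "__main__":
    print("days to reach target:", days_to_reach())
    print("rows per page:", rows_per_page())
    print("estimated size (GiB): %.1f" % estimated_size_gib())
```

Under these assumptions a page holds several hundred rows and a billion rows take on the order of tens of gigabytes, reached after a little over three years of inserts. Real numbers will differ with row format, page fill and insert order, which is another reason to measure on a test table.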
