Table with 80 million records and adding an index takes more than 18 hours (or forever)! Now what?


Question

A short recap of what happened. I am working with 71 million records (not much compared to the billions of records processed by others). On a different thread, someone suggested that the current setup of my cluster is not suitable for my needs. My table structure is:

CREATE TABLE `IPAddresses` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `ipaddress` bigint(20) unsigned default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM;

And I added the 71 million records and then did a:

ALTER TABLE IPAddresses ADD INDEX(ipaddress);

It's been 14 hours and the operation is still not completed. Upon Googling, I found that there is a well-known approach for solving this problem - Partitioning. I understand that I need to partition my table now based on the ipaddress but can I do this without recreating the entire table? I mean, through an ALTER statement? If yes, there was one requirement saying that the column to be partitioned on should be a primary key. I will be using the id of this ipaddress in constructing a different table so ipaddress is not my primary key. How do I partition my table given this scenario?
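
(For reference, MySQL does let you add partitioning to an existing table through ALTER TABLE; the sketch below is only my illustration of what that could look like, not what the accepted answer ends up doing. MySQL requires every unique key, including the primary key, to contain all columns of the partitioning expression, so the primary key has to be widened first, and the ALTER still rebuilds the table internally, which can itself take a long time on 71 million rows.)

ALTER TABLE IPAddresses
  MODIFY ipaddress BIGINT UNSIGNED NOT NULL,  -- a primary-key column cannot remain nullable
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id, ipaddress);

ALTER TABLE IPAddresses
  PARTITION BY HASH (ipaddress)
  PARTITIONS 20;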

Answer

OK, it turns out that this problem was more than just a simple create-a-table, index-it-and-forget problem :) Here's what I did, just in case someone else faces the same problem (I have used IP addresses as the example, but it works for other data types too):

Problem: Your table has millions of entries and you need to add an index fast.

Use case: Consider storing millions of IP addresses in a lookup table. Adding the IP addresses should not be a big problem, but creating an index on them takes more than 14 hours.
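
(A side note that is purely my assumption, since the answer never spells it out: the BIGINT column presumably holds IPv4 addresses converted to integers, which MySQL can do with INET_ATON() and reverse with INET_NTOA().)

INSERT INTO IPAddresses (ipaddress) VALUES (INET_ATON('192.168.10.50'));

SELECT id, INET_NTOA(ipaddress) AS ip
FROM IPAddresses
WHERE ipaddress = INET_ATON('192.168.10.50');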

Solution: Use MySQL's partitioning strategy.

Case #1: When the table has not been created yet.

CREATE TABLE IPADDRESSES(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id, ipaddress)
) ENGINE=MYISAM
PARTITION BY HASH(ipaddress)
PARTITIONS 20;
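
(A quick way to check that partition pruning works once data is loaded - not part of the original answer: an equality lookup on the HASH column should touch only one partition. On MySQL 5.1-5.6 this is shown by EXPLAIN PARTITIONS; from 5.7 on, plain EXPLAIN already includes a partitions column.)

EXPLAIN PARTITIONS
SELECT id FROM IPADDRESSES WHERE ipaddress = INET_ATON('10.0.0.1');
-- the partitions column should list a single partition, e.g. p7, rather than all 20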

Case #2: When the table you want is already created. There seems to be a way to use ALTER TABLE to do this but I have not yet figured out a proper solution for this. Instead, there is a slightly inefficient solution:

CREATE TABLE IPADDRESSES_TEMP(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id)
) ENGINE=MYISAM;
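
(Purely as an illustration of one way to populate the staging table; the file path and the one-address-per-line format below are my assumptions, not part of the original answer.)

LOAD DATA INFILE '/tmp/ipaddresses.txt'  -- hypothetical file: one dotted-quad IPv4 address per line
INTO TABLE IPADDRESSES_TEMP
  (@ip)
SET ipaddress = INET_ATON(@ip);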

However you load them, once your IP addresses are in the staging table, create the actual table with partitions:

CREATE TABLE IPADDRESSES(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id, ipaddress)
) ENGINE=MYISAM
PARTITION BY HASH(ipaddress)
PARTITIONS 20;

And finally:

INSERT INTO IPADDRESSES(ipaddress) SELECT ipaddress FROM IPADDRESSES_TEMP;
DROP TABLE IPADDRESSES_TEMP;
ALTER TABLE IPADDRESSES ADD INDEX(ipaddress);
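
(Once that final ALTER completes, a routine sanity check - my addition, not in the original answer - is to confirm that the new index exists:)

SHOW INDEX FROM IPADDRESSES;
-- expect the PRIMARY key entries on (id, ipaddress) plus the new index on ipaddress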

And there you go... indexing on the new table took me about 2 hours on a 3.2GHz machine with 1GB RAM :) Hope this helps.
