MySQL optimizing INSERT speed being slowed down because of indices


Problem Description

The MySQL documentation says:


The size of the table slows down the insertion of indexes by log N, assuming B-tree indexes.


Does this mean that the insertion of each new row will be slowed down by a factor of log N, where N, I assume, is the number of rows? Even if I insert all the rows in just one query, i.e.:

INSERT INTO mytable VALUES (1,1,1), (2,2,2), (3,3,3), ..., (n,n,n)


Where n is ~70,000


I currently have ~1.47 million rows in a table with the following structure :

CREATE TABLE mytable (
   `id` INT,
   `value` MEDIUMINT(5),
   `date` DATE,
   PRIMARY KEY (`id`, `date`)
) ENGINE = InnoDB;


When I insert in the fashion shown above inside a transaction, the commit takes ~275 seconds. How can I optimize this, given that new data is added every day and the insert time will only keep getting slower?


Also, is there anything apart from the queries themselves that might help? Maybe some configuration settings?


I read that removing indices just before an insert might help insert speed, and that the indices can be added back afterwards. But here the only index is the primary key, and dropping it won't help much in my opinion. Also, while the primary key is dropped, all the SELECT queries will be cripplingly slow.

I don't know of any other query-based approach that might help.
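For reference, a minimal sketch of that drop-and-re-add approach, assuming the table definition above:

-- Drop the primary key before the bulk insert
ALTER TABLE mytable DROP PRIMARY KEY;

-- ... run the bulk INSERT here, e.g. the multi-row VALUES statement above ...

-- Re-add it afterwards (this rebuilds the index over all existing rows)
ALTER TABLE mytable ADD PRIMARY KEY (`id`, `date`);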


Edit: Here are a few tests on inserting ~60,000 rows into the table with ~1.47 million rows:

Using the simple query stated above: 146 seconds

Using MySQL's LOAD DATA INFILE: 145 seconds


Using MySQL's LOAD DATA INFILE and splitting the CSV files as suggested by David Jashi in his answer: 136 seconds for 60 files with 1,000 rows each, 136 seconds for 6 files with 10,000 rows each


Removing and re-adding the primary key: key removal took 11 seconds, inserting the data took 0.8 seconds, BUT re-adding the primary key took 153 seconds, for a total of ~165 seconds
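For completeness, the LOAD DATA INFILE test would look roughly like the following; the file path and the field/line terminators are assumptions about the CSV layout:

-- Bulk load one CSV chunk into the table (path and terminators are illustrative)
LOAD DATA INFILE '/tmp/mytable_chunk.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(`id`, `value`, `date`);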

Recommended Answer


If you want fast inserts, the first thing you need is proper hardware. That means a sufficient amount of RAM, an SSD instead of mechanical drives, and a rather powerful CPU.


Since you use InnoDB, what you want to do is optimize it, because the default config is designed for slow and old machines.

Here's a good read about configuring InnoDB.
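A minimal sketch of InnoDB settings that are commonly tuned for write-heavy workloads; the values are illustrative assumptions, not taken from the answer, and should be sized for your own hardware:

-- Flush the redo log roughly once per second instead of on every commit
-- (trades up to a second of durability for much faster commits)
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Keep the working set in memory; resizable at runtime in MySQL 5.7.5+,
-- otherwise set innodb_buffer_pool_size in my.cnf and restart
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;  -- 4 GB, illustrative

-- innodb_log_file_size is not dynamic; set it in my.cnf and restart the server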


After that, you need to understand one thing: how databases do their work internally, how hard drives work, and so on. I'll simplify the mechanism in the following description:


A transaction is MySQL waiting for the hard drive to confirm that it wrote the data. That's why transactions are slow on mechanical drives: they can do 200-400 input/output operations per second. Translated, that means you can get roughly 200 insert queries per second using InnoDB on a mechanical drive. Naturally, this is a simplified explanation, just to outline what's happening; it's not the full mechanism behind a transaction.


Since a query, especially one for rows as small as your table's, is relatively small in terms of bytes, you're effectively wasting precious IOPS on a single query.


If you wrap multiple queries (100 or 200 or more; there's no exact number, you have to test) in a single transaction and then commit it, you'll instantly achieve more writes per second.
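In SQL terms, that batching looks roughly like this (the row values are illustrative):

START TRANSACTION;
INSERT INTO mytable (`id`, `value`, `date`) VALUES (1, 10, '2014-01-01');
INSERT INTO mytable (`id`, `value`, `date`) VALUES (2, 20, '2014-01-01');
-- ... a few hundred more inserts ...
COMMIT;  -- one log flush for the whole batch instead of one per autocommitted statement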


The Percona guys achieve 15k inserts a second on relatively cheap hardware. Even 5k inserts a second isn't bad. A table such as yours is small; I've done tests on a similar table (3 more columns) and managed to get to 1 billion records without noticeable issues, using a machine with 16 GB of RAM and a 240 GB SSD (1 drive, no RAID, used for testing purposes).


TL;DR: follow the link above, configure your server, get an SSD, wrap multiple inserts in one transaction, and profit. And don't turn indexing off and then back on; it isn't always applicable, because at some point you will spend processing and IO time building the indexes again.

