Optimizing InnoDB Insert Queries


Problem Description

According to the slow query log, the following query (and similar queries) would occasionally take around 2 seconds to execute:

INSERT INTO incoming_gprs_data (data,type) VALUES ('3782379837891273|890128398120983891823881abcabc','GT100');

Table structure:

CREATE TABLE `incoming_gprs_data` (
 `id` int(200) NOT NULL AUTO_INCREMENT,
 `dt` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `data` text NOT NULL,
 `type` char(10) NOT NULL,
 `test_udp_id` int(20) NOT NULL,
 `parse_result` text NOT NULL,
 `completed` tinyint(1) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `completed` (`completed`)
) ENGINE=InnoDB AUTO_INCREMENT=5478246 DEFAULT CHARSET=latin1

Activities related to this table:


  1. Around 200 rows are inserted into this table every second. The incoming data originates from different sources (thus, it is not inserted by a single process but by multiple processes every second).
  2. A cron process fetches the pending rows via SELECT * FROM incoming_gprs_data WHERE completed = 0, processes them, and updates them to completed = 1 (a minimal sketch of this worker follows the list).
  3. Another cron process (running every 15 minutes) deletes the completed rows (i.e. completed = 1) to keep the table slim.
  4. The slow query log does not show any slow SELECT queries related to this table.
  5. The table is relatively small: fewer than 200K rows.
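For illustration only, here is a minimal sketch of the worker described in items 2 and 3, assuming a Python client with the PyMySQL driver; the connection parameters and the process_row helper are assumptions, not part of the original setup:

 import pymysql

 # Assumed connection parameters; adjust for your environment.
 conn = pymysql.connect(host="localhost", user="app", password="secret", db="mydb")

 def process_row(row):
     # Placeholder for the real parsing logic (assumption).
     pass

 # Step 2: fetch pending rows, process them, then mark them completed.
 with conn.cursor(pymysql.cursors.DictCursor) as cur:
     cur.execute("SELECT id, data, type FROM incoming_gprs_data WHERE completed = 0")
     rows = cur.fetchall()
     for row in rows:
         process_row(row)
     if rows:
         ids = [row["id"] for row in rows]
         placeholders = ",".join(["%s"] * len(ids))
         cur.execute("UPDATE incoming_gprs_data SET completed = 1"
                     " WHERE id IN (" + placeholders + ")", ids)
 conn.commit()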

The reason we are doing #2 and #3 is that previously we discovered that deleting completed rows took time, because the index needed to be rebuilt. Therefore, we added the completed flag and perform the deletion less frequently. These changes helped to reduce the number of slow queries.

Here are the InnoDB settings that we have:

+---------------------------------+------------------------+
| Variable_name                   | Value                  |
+---------------------------------+------------------------+
| have_innodb                     | YES                    |
| ignore_builtin_innodb           | OFF                    |
| innodb_adaptive_flushing        | ON                     |
| innodb_adaptive_hash_index      | ON                     |
| innodb_additional_mem_pool_size | 8388608                |
| innodb_autoextend_increment     | 8                      |
| innodb_autoinc_lock_mode        | 1                      |
| innodb_buffer_pool_instances    | 2                      |
| innodb_buffer_pool_size         | 6442450944             |
| innodb_change_buffering         | all                    |
| innodb_checksums                | ON                     |
| innodb_commit_concurrency       | 0                      |
| innodb_concurrency_tickets      | 500                    |
| innodb_data_file_path           | ibdata1:10M:autoextend |
| innodb_data_home_dir            |                        |
| innodb_doublewrite              | OFF                    |
| innodb_fast_shutdown            | 1                      |
| innodb_file_format              | Antelope               |
| innodb_file_format_check        | ON                     |
| innodb_file_format_max          | Antelope               |
| innodb_file_per_table           | ON                     |
| innodb_flush_log_at_trx_commit  | 2                      |
| innodb_flush_method             | O_DIRECT               |
| innodb_force_load_corrupted     | OFF                    |
| innodb_force_recovery           | 0                      |
| innodb_io_capacity              | 200                    |
| innodb_large_prefix             | OFF                    |
| innodb_lock_wait_timeout        | 50                     |
| innodb_locks_unsafe_for_binlog  | OFF                    |
| innodb_log_buffer_size          | 67108864               |
| innodb_log_file_size            | 536870912              |
| innodb_log_files_in_group       | 2                      |
| innodb_log_group_home_dir       | ./                     |
| innodb_max_dirty_pages_pct      | 75                     |
| innodb_max_purge_lag            | 0                      |
| innodb_mirrored_log_groups      | 1                      |
| innodb_old_blocks_pct           | 37                     |
| innodb_old_blocks_time          | 0                      |
| innodb_open_files               | 300                    |
| innodb_purge_batch_size         | 20                     |
| innodb_purge_threads            | 0                      |
| innodb_random_read_ahead        | OFF                    |
| innodb_read_ahead_threshold     | 56                     |
| innodb_read_io_threads          | 4                      |
| innodb_replication_delay        | 0                      |
| innodb_rollback_on_timeout      | OFF                    |
| innodb_rollback_segments        | 128                    |
| innodb_spin_wait_delay          | 6                      |
| innodb_stats_method             | nulls_equal            |
| innodb_stats_on_metadata        | OFF                    |
| innodb_stats_sample_pages       | 8                      |
| innodb_strict_mode              | OFF                    |
| innodb_support_xa               | ON                     |
| innodb_sync_spin_loops          | 30                     |
| innodb_table_locks              | ON                     |
| innodb_thread_concurrency       | 0                      |
| innodb_thread_sleep_delay       | 10000                  |
| innodb_use_native_aio           | OFF                    |
| innodb_use_sys_malloc           | ON                     |
| innodb_version                  | 1.1.8                  |
| innodb_write_io_threads         | 4                      |
+---------------------------------+------------------------+

We have set innodb_buffer_pool_size to 6G after calculating it with the following SQL query:

SELECT CEILING(Total_InnoDB_Bytes*1.6/POWER(1024,3)) RIBPS
FROM (SELECT SUM(data_length+index_length) Total_InnoDB_Bytes
      FROM information_schema.tables
      WHERE engine='InnoDB') A;

It generates a result of 5GB. We estimated that our InnoDB tables won't exceed this size.

Our primary concern at the moment is how to speed up the insert queries into this table, and what causes the occasional slow insert queries.

Recommended Answer

As you know, 200 rows a second of insertion is a lot. It is worth your trouble to try to optimize this data flow on an application of this scale.

InnoDB uses database transactions on all insertions. That is, every insert looks like this:

 START TRANSACTION;
 INSERT something...;
 COMMIT;

If you don't specify these transactions, you get autocommit behavior.

The secret to doing insertions at high volume is to do many of them in each transaction, like so:

 START TRANSACTION;
 INSERT something...;
 INSERT something...;
 ...
 INSERT something...;
 INSERT something...;
 INSERT something...;
 COMMIT;
 START TRANSACTION;
 INSERT something...;
 INSERT something...;
 ...
 INSERT something...;
 INSERT something...;
 INSERT something...;
 COMMIT;
 START TRANSACTION;
 INSERT something...;
 INSERT something...;
 ...
 INSERT something...;
 INSERT something...;
 INSERT something...;
 COMMIT;

I have had good success batching up to one hundred INSERT commands before each COMMIT.

Do not forget the final COMMIT! Don't ask me how I know to give this advice. :-)
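As a concrete illustration of that batching on the client side, here is a minimal Python sketch, assuming the PyMySQL driver and a list of (data, type) tuples called pending; the driver and the connection parameters are assumptions:

 import pymysql

 # autocommit is off, so each COMMIT below covers one whole batch.
 conn = pymysql.connect(host="localhost", user="app", password="secret",
                        db="mydb", autocommit=False)

 BATCH_SIZE = 100  # roughly one hundred INSERTs per COMMIT, as suggested above

 def insert_batched(pending):
     """pending: a list of (data, type) tuples collected by the application."""
     with conn.cursor() as cur:
         for i, (data, typ) in enumerate(pending, start=1):
             cur.execute(
                 "INSERT INTO incoming_gprs_data (data, type) VALUES (%s, %s)",
                 (data, typ))
             if i % BATCH_SIZE == 0:
                 conn.commit()  # end of one transaction batch
     conn.commit()              # do not forget the final COMMIT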

Another way to do this in MySQL is with multiple-row INSERT commands. In your case they might look like this:

INSERT INTO incoming_gprs_data (data,type) VALUES
    ('3782379837891273|890128398120983891823881abcabc','GT100'),
    ('3782379837891273|890128398120983891823881abcabd','GT101'),
    ('3782379837891273|890128398120983891823881abcabe','GT102'),
       ...
    ('3782379837891273|890128398120983891823881abcabf','GT103'),
    ('3782379837891273|890128398120983891823881abcac0','GT104');
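A client can build such a multi-row statement dynamically. The sketch below generates one placeholder group per row and sends a single INSERT; as before, PyMySQL and the pending list of (data, type) tuples are assumptions:

 import pymysql

 # Assumed connection parameters, as in the earlier sketch.
 conn = pymysql.connect(host="localhost", user="app", password="secret", db="mydb")

 def insert_multirow(pending):
     """Insert all (data, type) tuples in pending with a single multi-row INSERT."""
     if not pending:
         return
     # Builds VALUES (%s,%s),(%s,%s),... with one group per row.
     groups = ",".join(["(%s,%s)"] * len(pending))
     params = [value for row in pending for value in row]  # flatten the tuples
     with conn.cursor() as cur:
         cur.execute("INSERT INTO incoming_gprs_data (data, type) VALUES " + groups,
                     params)
     conn.commit()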

A third way, the hardest and the highest-performance way, to get a very high insert rate is to store your batches of data in text files and then use the LOAD DATA INFILE command to load the data into your table. This technique can be very fast indeed, especially if the file can be loaded directly from the file system of your MySQL server.
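As a rough sketch of that third approach: write the batch to a delimited file and load it with LOAD DATA. Here the file path, the tab delimiter, and the use of LOAD DATA LOCAL INFILE through PyMySQL are all assumptions; both the server and the client must allow local_infile for this to work:

 import csv
 import pymysql

 # local_infile=True lets the client send the file; the server must also permit it.
 conn = pymysql.connect(host="localhost", user="app", password="secret",
                        db="mydb", local_infile=True)

 def bulk_load(pending, path="/tmp/gprs_batch.tsv"):
     """Write the pending (data, type) tuples to a TSV file, then bulk-load it."""
     with open(path, "w", newline="") as f:
         csv.writer(f, delimiter="\t").writerows(pending)
     with conn.cursor() as cur:
         cur.execute(
             "LOAD DATA LOCAL INFILE %s INTO TABLE incoming_gprs_data "
             "FIELDS TERMINATED BY '\\t' (data, type)", (path,))
     conn.commit()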

I suggest you try the transaction approach first to see if you get the performance you need.

Another thing: if you have a quiet time of day or night, you can delete the completed rows then, rather than every fifteen minutes. In any case, when you read back these rows to process or delete them, you should use a transaction-batch process like this:

   done = false   /* pseudocode for your programming language */
   while not done {
       DELETE FROM table WHERE completed = 1 LIMIT 50;
       if that query handled zero rows {
           done = true
       }
   }

This will do your deletion operation in reasonably sized transactional batches. Your occasional two-second insertion delays are probably the result of very large transactional batches in your processing or deletion.
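Translated into Python under the same assumptions (PyMySQL driver, assumed connection parameters), that deletion loop might look like this:

 import pymysql

 # Assumed connection parameters, as in the earlier sketches.
 conn = pymysql.connect(host="localhost", user="app", password="secret", db="mydb")

 def purge_completed(batch_size=50):
     """Delete completed rows in small transactional batches."""
     while True:
         with conn.cursor() as cur:
             deleted = cur.execute(
                 "DELETE FROM incoming_gprs_data WHERE completed = 1 LIMIT %s",
                 (batch_size,))
         conn.commit()      # one small transaction per batch
         if deleted == 0:   # no rows left to delete
             break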
