MariaDB - Every batch of extended inserts takes gradually longer to execute


Problem description

I need to insert six hundred fifty million (650,000,000) records into a single table. I have tried the extended INSERT and LOAD DATA LOCAL INFILE methods.
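An extended insert packs many rows into a single INSERT statement. A minimal sketch with a hypothetical table and columns (the real structure is not reproduced on this page):

INSERT INTO my_table (id, amount, created_at) VALUES
(1, 12.500000, '2020-01-01 00:00:00'),
(2, 99.990000, '2020-01-01 00:00:01'),
(3, 5.250000, '2020-01-02 00:00:00');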

Extended insert results:

I split the data so that each extended INSERT contains 1,250,000 rows. The problem I'm facing is that every batch takes progressively longer to execute:

Batch 1 completed in 40s
Batch 2 completed in 50s
Batch 3 completed in 60s
Batch 4 completed in 80s
Batch 5 completed in 100s

It will take 15-20 hours to complete.

LOAD DATA LOCAL INFILE results:

I have a 40 GB CSV file, which I split into 500 files and loaded into MySQL. Here I'm facing the same issue: every file load takes longer than the previous one.
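A sketch of that split-and-load workflow; the file names, delimiter, and table name below are assumptions, not taken from the question:

# 650,000,000 rows / 500 files = 1,300,000 lines per piece
split -l 1300000 data.csv part_

# then, for each piece:
LOAD DATA LOCAL INFILE 'part_aa'
INTO TABLE my_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';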

Table structure for your reference:

If this is expected behavior, is there any other way to import the data quickly?

Recommended answer

Sort the data by the PRIMARY KEY. It is likely to be faster to sort the 40 GB file with the sort command than to have MySQL do it.
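A sketch using GNU sort, assuming the PRIMARY KEY is a numeric first column in a comma-delimited file (both are assumptions, since the table structure is not shown):

# -t, sets the field delimiter, -k1,1n sorts numerically on column 1,
# -S gives sort a large in-memory buffer before it spills to temporary files
sort -t, -k1,1n -S 4G -o sorted.csv data.csv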

Set innodb_buffer_pool_size to about 70% of available RAM. How much RAM do you have?
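For illustration, assuming a server with 32 GB of RAM (a figure invented here, since the question does not say), the my.cnf entry would be roughly:

[mysqld]
# about 70% of the assumed 32 GB
innodb_buffer_pool_size = 22G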

Do have the PRIMARY KEY already in place on the table.

Don't have any secondary indexes, FOREIGN KEYs, or triggers during the load. Add them later. Yes, that takes time.
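Once the load finishes, the deferred objects can be added back. A sketch with hypothetical table, column, and index names; combining the additions in one ALTER avoids scanning the table once per statement:

ALTER TABLE my_table
ADD INDEX idx_created_at (created_at),
ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users (id);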

Confirm that you will really need all those indexes.

The data could be shrunk slightly by picking something smaller than the 10-byte DECIMAL(20,6).
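For scale: MySQL packs each group of 9 decimal digits into 4 bytes, so DECIMAL(20,6) costs 7 bytes for its 14 integer digits plus 3 bytes for its 6 fractional digits, 10 bytes per value. A narrower column, if the values fit, saves those bytes on all 650 million rows. A hypothetical example (better declared this way before the load than ALTERed afterwards):

-- DECIMAL(12,6): 3 bytes for 6 integer digits + 3 bytes for 6 fractional digits = 6 bytes
ALTER TABLE my_table MODIFY amount DECIMAL(12,6);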
