Java Large database inserts


Problem description

I have a database into which I need to insert batches of data (around 500k records at a time). I was testing with Derby and was seeing insert times of about 10-15 minutes for this many records (I was doing a batch insert in Java).

Does this time seem slow (working on your average laptop)? And are there approaches to speeding it up?

thanks,

Jeff

Solution

This time seems perfectly reasonable, and is in agreement with times I have observed. If you want it to go faster, you need to use bulk insert options and disable safety features:

  • Use PreparedStatements and batches of 5,000 to 10,000 records unless it MUST be one transaction
  • Use bulk loading options in the DBMS
  • Disable integrity checks temporarily for insert
  • Disable indexes temporarily or delete indexes and re-create them post-insert
  • Disable transaction logging and re-enable afterward.
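
The first bullet can be sketched with plain JDBC as follows. This is a minimal illustration under stated assumptions, not the answerer's code: the table `records`, its columns `id` and `payload`, the class `BatchInsertSketch`, and the method `insertInBatches` are all hypothetical names. It also assumes the load does not have to be a single transaction, so it commits after every batch.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    // Hypothetical table and columns, used only for illustration.
    private static final String INSERT_SQL =
            "INSERT INTO records (id, payload) VALUES (?, ?)";

    /**
     * Inserts every payload through one reused PreparedStatement, sending the
     * rows to the database and committing once per batchSize rows.
     * Returns the number of batches that were executed.
     */
    static int insertInBatches(Connection conn, List<String> payloads, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit per batch instead of per row
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (int i = 0; i < payloads.size(); i++) {
                ps.setInt(1, i);
                ps.setString(2, payloads.get(i));
                ps.addBatch();
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final, partially filled batch
                ps.executeBatch();
                conn.commit();
                batches++;
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the rows of the batch that failed
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return batches;
    }
}
```

Given an open `Connection` from whichever driver is in use, a call such as `BatchInsertSketch.insertInBatches(conn, payloads, 5000)` would flush and commit every 5,000 rows. The best batch size depends on the DBMS and the hardware, so it is worth measuring a few values rather than trusting any single number.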

EDIT: Database transactions are limited by disk I/O, and on laptops and most hard drives, the important number is seek time for the disk.

Laptops tend to have rather slow disks, at 5400 rpm. At this speed, seek time is about 5 ms. If we assume one seek per record (an over-estimate in most cases), it would take about 40 minutes (500000 * 5 ms = 2,500 s) to insert all rows. Caching and write sequencing reduce this somewhat, which is consistent with the 10-15 minutes you measured, but you can see where the problem comes from.
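
The back-of-the-envelope estimate above can be written out as a small calculation. This is only a sketch of the arithmetic, assuming exactly one seek per record and nothing else; the class `SeekTimeEstimate` and the method `worstCaseMinutes` are hypothetical names introduced here.

```java
class SeekTimeEstimate {

    /**
     * Worst-case load time in minutes, assuming every record costs one full
     * disk seek and ignoring caching, batching, and write sequencing.
     */
    static double worstCaseMinutes(long records, double seekMillis) {
        double totalMillis = records * seekMillis;
        return totalMillis / 1000.0 / 60.0;
    }

    public static void main(String[] args) {
        // 500,000 records at about 5 ms per seek, the figures used in the answer.
        double minutes = worstCaseMinutes(500_000, 5.0);
        System.out.printf("Worst case: %.1f minutes%n", minutes);
    }
}
```

For the figures in the answer this comes to 2,500 seconds, a little under 42 minutes, which is the "about 40 minutes" upper bound quoted there.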

I am (of course) vastly oversimplifying the problem, but you can see where I'm going with this; it's unreasonable to expect databases to perform at the same speed as sequential bulk I/O. You've got to apply some sort of indexing to your record, and that takes time.
