Fastest way to iterate through large table using JDBC
Question
I'm trying to create a Java program to clean up and merge rows in my table. The table is large, about 500k rows, and my current solution is running very slowly. The first thing I want to do is simply get an in-memory array of objects representing all the rows of my table. Here is what I'm doing:
- pick an increment of, say, 1000 rows at a time
- use JDBC to fetch a result set for the following SQL query: SELECT * FROM TABLE WHERE ID > 0 AND ID < 1000
- add the resulting data to an in-memory array
- continue querying all the way up to 500,000 in increments of 1000, each time adding results.
This is taking way too long. In fact, it's not even getting past the second increment from 1000 to 2000. The query takes forever to finish (although when I run the same thing directly through a MySQL browser, it's decently fast). It's been a while since I've used JDBC directly. Is there a faster alternative?
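The batching steps described above can be sketched roughly as follows. This is only an illustration: `IdWindows` and `windows` are hypothetical names, and the sketch assumes half-open bounds (`ID >= ? AND ID < ?`) instead of the strict bounds in the question, because `ID > 0 AND ID < 1000` followed by `ID > 1000 AND ID < 2000` would skip the row with ID 1000.

```java
import java.util.ArrayList;
import java.util.List;

public class IdWindows {
    // Hypothetical helper: splits the ID space 0..maxId into half-open windows
    // [from, to) so that consecutive windows neither overlap nor skip an ID.
    public static List<long[]> windows(long maxId, long step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long from = 0; from <= maxId; from += step) {
            result.add(new long[] {from, from + step});
        }
        return result;
    }

    public static void main(String[] args) {
        // Each window would be bound to the two placeholders of a query such as:
        // SELECT * FROM TABLE WHERE ID >= ? AND ID < ?
        for (long[] w : windows(2500, 1000)) {
            System.out.println("ID >= " + w[0] + " AND ID < " + w[1]);
        }
    }
}
```

Binding the two bounds through a `PreparedStatement` would let the same statement be reused for every window.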
Recommended answer
First of all, are you sure you need the whole table in memory? Maybe you should consider (if possible) selecting only the rows that you want to update, merge, etc. If you really have to have the whole table, you could consider using a scrollable ResultSet. You can create it like this:
// make sure autocommit is off (postgres)
con.setAutoCommit(false);
Statement stmt = con.createStatement(
        ResultSet.TYPE_SCROLL_INSENSITIVE, // or ResultSet.TYPE_FORWARD_ONLY
        ResultSet.CONCUR_READ_ONLY);
ResultSet srs = stmt.executeQuery("select * from ...");
It enables you to move to any row you want by using the 'absolute' and 'relative' methods.
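As a minimal sketch of those two methods, the helper below jumps around a scrollable result set. `ScrollDemo` and `sample` are hypothetical names, and the sketch assumes the first column is numeric and that the result set was created with a scrollable type such as `TYPE_SCROLL_INSENSITIVE` (on a `TYPE_FORWARD_ONLY` result set these calls throw `SQLException`).

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ScrollDemo {
    // Hypothetical helper: reads column 1 from a few positions of a
    // scrollable ResultSet. Both absolute() and relative() return false
    // when the cursor ends up off the result set, so each read is guarded.
    public static List<Long> sample(ResultSet srs) throws SQLException {
        List<Long> values = new ArrayList<>();
        if (srs.absolute(1)) {        // jump to the first row
            values.add(srs.getLong(1));
        }
        if (srs.relative(2)) {        // move two rows forward from the current row
            values.add(srs.getLong(1));
        }
        if (srs.absolute(-1)) {       // a negative index counts from the end: last row
            values.add(srs.getLong(1));
        }
        return values;
    }
}
```

It would be called with the `srs` object from the snippet above, for example `ScrollDemo.sample(srs)`.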