通过Java从MySQL获取大量记录 [英] Fetching large number of records from MySQL through Java
问题描述
有一个MySQL表,服务器上的用户".它有28行和1百万条记录(它可能还会增加).我想从该表中获取所有行,对其进行一些操作,然后将它们添加到MongoDB中.我知道通过简单的从用户选择*"操作检索这些记录将花费大量时间.我一直在用Java,JDBC进行此操作. 因此,我从研究中得到的选择是:
There is a MySQL table, Users on a Server. It has 28 rows and 1 million records (It may increase as well). I want to fetch all rows from this table, do some manipulation on them and then want to add them to MongoDB. I know that it will take lots of time to retrieve these records through simple 'Select * from Users' operation. I have been doing this in Java, JDBC. So, the options I got from my research is:
选项1.执行批处理:我的计划是从表中获取总行数,即.从用户中选择count(*).然后,将读取大小设置为1000(setFetchSize(1000)).之后,我被困住了.我不知道我是否可以写这样的东西:
Option 1. Do batch processing : My plan was to get the total number of rows from the table, ie. select count(*) from users. Then, set a fetch size of say 1000 (setFetchSize(1000)). After that I was stuck. I did not know if I can write something like this:
Connection conn = DriverManager.getConnection(connectionUrl, userName,passWord);
Statement stmt =conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,java.sql.ResultSet.CONCUR_READ_ONLY);
String query="select * from users";
ResultSet resultSet=stmt.executeQuery(query);
-
我的疑问是,执行查询后resultSet是否会有1000个条目,我是否应该重复执行该操作,直到检索到所有记录.
My doubt was whether resultSet will have 1000 entries once I execute the query and should I repeatedly do the operation till all records are retrieved.
我放弃了计划,因为据我了解,对于MySQL而言,ResultSet会立即完全填充,并且批处理可能无法进行. 此stackoverflow讨论和
I dropped the plan because, I understand that for MySQL, ResultSet is fully populated at once and batching might not work. This stackoverflow discussion and MySQL documentation helped out.
选项2.进行分页:我的想法是设置一个限制,该限制将告知获取的起始索引和获取的偏移量.可以将偏移量设置为1000并遍历索引.
Option 2. Do pagination: My idea is that I will set a Limit which will tell starting index for fetching and offset for fetching. May be, set the offset as 1000 and iterate over the index.
我阅读了建议的文章链接,但是使用Limit解决此问题时未发现任何漏洞.
I read a suggested article link, but did not find any loop holes in approaching this problem using Limit.
任何一个足够友善和耐心地阅读这篇长文章的人,能否请您分享对我的思想过程的宝贵意见,如果有什么错误或遗漏,请纠正我.
Anybody who is kind enough and patient enough to read this long post, could you please share your valuable opinions on my thought process and correct me if there is something wrong or missing.
推荐答案
根据我所做的研究回答自己的问题:
Answering my own question based on the research I did:
-
对于选择查询而言,分批处理并不是真正有效,特别是如果您要使用每个查询操作的结果集.
Batching is not really effective for select queries, especially if you want to use the resultset of each query operation.
分页-如果要提高内存效率,而不是为了提高执行速度,则很好.每次JDBC必须连接到MySQL时,使用Limit触发多个查询时,速度都会降低.
Pagination - Good if you want to improve the memory efficiency, not for improving speed of execution. Speed comes down as you fire multiple queries with Limit, as every time JDBC has to connect to MySQL.
这篇关于通过Java从MySQL获取大量记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
-