Reading a large number of records from MySQL into Java


Question

I have a MySQL database with more than 8 million records that I need to process (this cannot be done in the database itself), and I run into problems when trying to read them into my Java application.

I have already tried solutions from people with similar problems (e.g., link), but none of them worked for me. I tried setting the fetch size and so on, but no luck. My application is built around a BlockingQueue: the Producer continuously reads data from the database and stores it in the queue so that the Consumer can process it. This way I limit the number of records held in main memory at any one time.

My code works for a small number of records (I tested with 1000 records), so I suspect the phase that moves data from the database into my application is what needs fixing.

Edit1

connection = ConnectionFactory.getConnection(DATABASE);
preparedStatement = connection.prepareStatement(query, java.sql.ResultSet.CONCUR_READ_ONLY, java.sql.ResultSet.TYPE_FORWARD_ONLY);
preparedStatement.setFetchSize(1000); 
preparedStatement.executeQuery();
rs = preparedStatement.getResultSet();

Edit2

Now I finally get some output instead of just watching my available memory go down. I get this error:

Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
at com.mysql.jdbc.Buffer.<init>(Buffer.java:59)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:2089)
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3554)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:491)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3245)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2413)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2836)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2828)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2777)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1651)
at razoralliance.dao.DataDAOImpl.getAllDataRS(DataDAOImpl.java:38)
at razoralliance.app.DataProducer.run(DataProducer.java:34)
at java.lang.Thread.run(Thread.java:722)

Edit3

I did some more research on the Producer-Consumer pattern, and it turns out that when the Consumer cannot keep up with the Producer, the queue grows automatically and eventually runs out of memory. So I switched to ArrayBlockingQueue, which has a fixed size. However, I still get memory leaks. Eclipse Memory Analyzer says that the ArrayBlockingQueue occupies 65.31% of my memory, even though it holds only 1000 objects with 4 text fields each.
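The bounded-queue idea described above can be sketched as follows. This is only an illustration under stated assumptions: the class name BoundedQueueDemo, the END sentinel, and the fake "row-N" strings are hypothetical stand-ins for the asker's Producer, Consumer, and database rows, none of which are shown in the post.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; names are illustrative and not taken from the original post.
final class BoundedQueueDemo {
    // Sentinel ("poison pill") telling the consumer that no more rows will arrive.
    // A fresh String instance, so it can be recognised by identity.
    private static final String END = new String("END");

    /** Pushes rowCount fake rows through a queue of the given capacity; returns rows consumed. */
    static int run(int rowCount, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int[] consumed = {0};

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty.
                for (String row = queue.take(); row != END; row = queue.take()) {
                    consumed[0]++; // real processing would go here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 0; i < rowCount; i++) {
                // put() blocks while the queue is full, so at most `capacity` rows wait in it.
                queue.put("row-" + i);
            }
            queue.put(END);
            consumer.join(); // join() makes the consumer's count visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while producing", e);
        }
        return consumed[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100_000, 1000));
    }
}
```

A bounded queue only caps what sits between Producer and Consumer. If the JDBC driver has already buffered the whole result set before the first row is handed over, as the stack trace in Edit2 suggests, the queue capacity cannot prevent the OutOfMemoryError.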

Recommended answer

You will need to stream your results. With the MySQL driver it appears you have to set CONCUR_READ_ONLY and TYPE_FORWARD_ONLY for your ResultSet. Also, set the fetch size accordingly: stmt.setFetchSize(Integer.MIN_VALUE);


By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
                            java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.
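Putting the pieces of the answer together, a streaming read might look roughly like the sketch below. The helper class StreamingReader, its method name, and the choice to read only the first column as a string are assumptions made for illustration; the Connection is expected to come from MySQL Connector/J, since the Integer.MIN_VALUE fetch-size signal is specific to that driver. Note that createStatement takes the result set type first and the concurrency second.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.function.Consumer;

// Hypothetical helper; class and method names are illustrative only.
final class StreamingReader {
    /**
     * Runs the query with row-by-row streaming requested and passes the first
     * column of every row to the handler. Returns the number of rows seen.
     */
    static long streamFirstColumn(Connection conn, String query, Consumer<String> handler)
            throws SQLException {
        // Argument order: result set type first, then concurrency.
        try (Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                   ResultSet.CONCUR_READ_ONLY)) {
            // Integer.MIN_VALUE is the signal described above for streaming row by row.
            stmt.setFetchSize(Integer.MIN_VALUE);
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    // Assumes column 1 can be read as a string; adapt to the real schema.
                    handler.accept(rs.getString(1));
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

In a Producer like the one in the question, the handler could be the place where each row is put on the bounded queue, so that rows are pulled from the server only as fast as the Consumer drains them.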

There are some caveats with this approach...
