Fetching millions of records in Java


Question

A very open question: I need to write a Java client that reads millions of records (say, account information) from an Oracle database, dumps them to XML, and sends the result to a vendor through web services.

What is the most efficient way to do this, starting with fetching the millions of records? I went the JPA/Hibernate route and got OutOfMemoryError when fetching 2 million records.

Is JDBC a better approach: fetch each row and build the XML as I go? Are there any other alternatives?

I am not an expert in Java, so any guidance is appreciated.

Recommended answer

We faced a similar problem some time back, with more than 2 million records. This is how we approached it.

  • Any O/R mapping tool was ruled out because of its overhead, such as the creation of large POJOs, which is unnecessary when the data is only going to be dumped to XML.

Plain JDBC is the way to go. Its main advantage is that it returns a ResultSet object that does not contain all the results at once; the data is loaded as we iterate over the ResultSet. That solves the problem of loading the entire data set into memory.
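As a rough illustration of the JDBC approach, here is a minimal sketch that reads rows one at a time and hands each to a callback. The table `accounts`, the columns `id` and `name`, the `RowHandler` interface, and the fetch size of 1000 are all assumptions made for the example, not details from the answer. `setFetchSize` is only a hint to the driver about how many rows to pull per round trip; the right value depends on row width and available memory.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class AccountReader {

    /** Hypothetical per-row callback, introduced only for this sketch. */
    public interface RowHandler {
        void handle(long id, String name) throws Exception;
    }

    /**
     * Iterates over the query result row by row and returns the number of
     * rows processed. No list of rows is ever built, so memory use stays
     * bounded by the driver's fetch buffer.
     */
    public static long streamAccounts(Connection conn, RowHandler handler) throws Exception {
        // Hypothetical table and column names.
        String sql = "SELECT id, name FROM accounts";
        long count = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            // Hint: fetch rows from the database in batches of 1000 (assumed value).
            ps.setFetchSize(1000);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    handler.handle(rs.getLong("id"), rs.getString("name"));
                    count++;
                }
            }
        }
        return count;
    }
}
```

A caller would obtain the `Connection` from `DriverManager` or a connection pool and pass a handler that writes each row out immediately instead of collecting rows in memory.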

Next comes the creation of the XML file. We created an XML file and opened it in append mode.

Then, in the loop that iterates over the ResultSet object, we created XML fragments and appended them to the XML file. This went on until the entire ResultSet had been iterated.
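The loop described above might look something like the following sketch. Note that it swaps in a different mechanism from the one the answer describes: rather than appending hand-built fragments to a file opened in append mode, it uses the JDK's StAX `XMLStreamWriter`, which also writes each record as it is read and takes care of escaping. The element names (`accounts`, `account`, `name`) and the column names are assumptions for illustration.

```java
import java.io.Writer;
import java.sql.ResultSet;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class AccountXmlExporter {

    /**
     * Writes one <account> element per row to the given writer and returns
     * the number of rows written. Each row is written as soon as it is read,
     * so the full document is never held in memory.
     */
    public static long export(ResultSet rs, Writer out) throws Exception {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        long count = 0;
        xml.writeStartDocument();
        xml.writeStartElement("accounts");          // hypothetical root element
        while (rs.next()) {
            xml.writeStartElement("account");       // hypothetical record element
            xml.writeAttribute("id", Long.toString(rs.getLong("id")));
            xml.writeStartElement("name");
            String name = rs.getString("name");
            if (name != null) {
                xml.writeCharacters(name);          // escapes characters such as & and <
            }
            xml.writeEndElement();                  // </name>
            xml.writeEndElement();                  // </account>
            count++;
        }
        xml.writeEndElement();                      // </accounts>
        xml.writeEndDocument();
        xml.close();                                // flushes; does not close the underlying writer
        return count;
    }
}
```

For a file target, the `Writer` could come from `Files.newBufferedWriter(path, StandardCharsets.UTF_8)`, closed by the caller once the export finishes.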

In the end, what we had was an XML file with all the records.

To share this file, we created a web service that returns the URL of the XML file (archived/zipped) if the file is available.
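The answer does not say which archive format was used. As one possibility, the sketch below compresses the finished file with gzip from the JDK's `java.util.zip` package; the class name, method name, and the `.gz` suffix are assumptions for the example. It relies on `InputStream.transferTo`, which requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

public class FileCompressor {

    /**
     * Compresses the source file into a sibling file with a ".gz" suffix
     * and returns the path of the compressed file. The data is streamed,
     * so file size is not limited by heap size.
     */
    public static Path gzip(Path source) throws IOException {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out);
        }
        return target;
    }
}
```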

The client can download the file at any time after that.

Note that this is not a synchronous system: the file does not become available immediately after the client makes the call. Creating the XML takes a long time and an HTTP request would normally time out, hence this approach.
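The asynchronous pattern could be sketched roughly as follows, leaving out the web service layer itself. The `ExportService` class, its method names, and the job ID scheme are hypothetical: one call starts the export in the background and returns immediately, and a second call reports the file location only once the export has finished. Mapping the returned `Path` to a download URL depends on the deployment and is not shown.

```java
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExportService {

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Map<String, Future<Path>> jobs = new ConcurrentHashMap<>();

    /** Starts the export in the background and returns a job ID right away. */
    public String start(Callable<Path> exportTask) {
        String jobId = UUID.randomUUID().toString();
        jobs.put(jobId, executor.submit(exportTask));
        return jobId;
    }

    /**
     * Returns the exported file if the job has finished, or an empty
     * Optional if the job is unknown or still running. Never blocks on a
     * running job. Throws ExecutionException if the export itself failed.
     */
    public Optional<Path> fileIfReady(String jobId)
            throws ExecutionException, InterruptedException {
        Future<Path> job = jobs.get(jobId);
        if (job == null || !job.isDone()) {
            return Optional.empty();
        }
        return Optional.of(job.get());
    }

    /** Stops accepting new jobs; already submitted jobs still run. */
    public void shutdown() {
        executor.shutdown();
    }
}
```

The client polls `fileIfReady` (through whatever endpoint exposes it) until a file is reported, so no single HTTP request has to stay open for the duration of the export.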

This is just one approach you can take a cue from. Hope this helps.
