Java: Advice on handling large data volumes. (Part Deux)


Question


Alright. So I have a very large amount of binary data (let's say, 10GB) distributed over a bunch of files (let's say, 5000) of varying lengths.


I am writing a Java application to process this data, and I wish to institute a good design for the data access. Typically, this is what happens:



  • One way or another, all the data will be read during the course of processing.
  • Each file is (typically) read sequentially, requiring only a few kilobytes at a time. However, it is often necessary to have, say, the first few kilobytes of each file simultaneously, or the middle few kilobytes of each file simultaneously, etc.
  • There are times when the application will want random access to a byte or two here and there.


Currently I am using the RandomAccessFile class to read into byte buffers (and ByteBuffers). My ultimate goal is to encapsulate the data access into some class such that it is fast and I never have to worry about it again. The basic functionality is that I will be asking it to read frames of data from specified files, and I wish to minimize the I/O operations given the considerations above.

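Examples of typical access:

  • Give me the first 10 kilobytes of all my files!
  • Give me bytes 0 through 999 of file F, then give me bytes 1 through 1000, then give me 2 through 1001, and so on.
  • Give me a megabyte of data from file F, starting at such-and-such byte!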


Any suggestions for a good design?

Recommended answer


Use Java NIO and MappedByteBuffers, and treat your files as a list of byte arrays. Then let the OS worry about the details of caching, reading, flushing, etc.
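To make that concrete, here is a minimal sketch of what such a wrapper might look like. The class name MappedFileStore and its methods readFrame and readByte are hypothetical names chosen for illustration, not part of any library. It assumes a 64-bit JVM and that each file is smaller than 2 GB, since a single mapping cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps each file once and serves frames from the mapping. */
final class MappedFileStore {
    private final Map<Path, MappedByteBuffer> mappings = new HashMap<>();

    /** Returns the read-only mapping for a file, creating it on first use. */
    private MappedByteBuffer mappingFor(Path file) throws IOException {
        MappedByteBuffer mapped = mappings.get(file);
        if (mapped == null) {
            // The mapping stays valid after the channel is closed.
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Assumption: the file fits in one mapping (under 2 GB).
                // Larger files would need several mappings, not handled here.
                mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            }
            mappings.put(file, mapped);
        }
        return mapped;
    }

    /** Copies `length` bytes starting at `offset` into a new array. */
    public byte[] readFrame(Path file, int offset, int length) throws IOException {
        // duplicate() shares the content but has its own position and limit,
        // so callers do not disturb each other's read position.
        ByteBuffer view = mappingFor(file).duplicate();
        view.position(offset);
        byte[] frame = new byte[length];
        view.get(frame);
        return frame;
    }

    /** Random access to one byte; an absolute get does not move the position. */
    public byte readByte(Path file, int offset) throws IOException {
        return mappingFor(file).get(offset);
    }
}
```

With something like this, the sliding-window pattern from the question becomes a sequence of calls such as readFrame(f, 0, 1000), readFrame(f, 1, 1000), and so on, and repeated reads of the same region are typically served from the OS page cache rather than from disk. Whether mapping all 5000 files at once is practical depends on the platform's limits on address space and on the number of mappings, so treat this as a starting point rather than a tuned design.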
