Java multithreading: reading a single large file


Question

What is an efficient approach for a multithreaded Java application in which many threads have to read the exact same file (> 1GB in size) and expose it as an input stream? I've noticed that with many threads (> 32), the system starts to contend over I/O and spends a lot of time in I/O wait.

I've considered loading the file into a byte array that's shared by all the threads, with each thread creating its own ByteArrayInputStream over it, but allocating a 1GB byte array just won't work well.

I've also considered using a single FileChannel, with each thread creating an InputStream on top of it using Channels.newInputStream(). However, it seems that it's the FileChannel that maintains the state for the InputStream, so the streams would not be independent of each other.

Recommended answer

It seems to me that you're going to have to load the file into memory if you want to avoid I/O contention. The operating system will do some buffering, but if you're finding that's not enough, you're going to have to do it yourself.

Do you really need 32 threads, though? Presumably you don't have nearly that many cores, so use fewer threads and you'll get less context switching and similar overhead.

Do your threads all process the file from start to finish? If so, could you effectively split the file into chunks? Read the first (say) 10MB of data into memory, let all the threads process it, then move on to the next 10MB, and so on.
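The chunk-at-a-time idea could be sketched roughly as follows. This is only an illustration under assumptions: the names ChunkHandler and ChunkedFanOut are hypothetical, every worker is assumed to consume the file strictly front to back, and handlers are assumed to treat the shared buffer as read-only. It needs Java 9 or later for InputStream.readNBytes(byte[], int, int).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical callback: one per worker, handed each chunk in file order. */
interface ChunkHandler {
    /** Only bytes [0, length) of chunk are valid; the buffer must not be modified. */
    void handle(byte[] chunk, int length);
}

final class ChunkedFanOut {
    /**
     * Reads the stream once, chunk by chunk, and lets every handler process
     * each chunk before the next one is read. Only one reader touches the
     * file, so the workers never compete for disk I/O.
     */
    static void process(InputStream in, int chunkSize,
                        List<ChunkHandler> handlers, ExecutorService pool)
            throws IOException, InterruptedException, ExecutionException {
        byte[] chunk = new byte[chunkSize];
        int n;
        // readNBytes fills the buffer unless the end of the stream is reached; it returns 0 at EOF.
        while ((n = in.readNBytes(chunk, 0, chunkSize)) > 0) {
            final int length = n;
            List<Callable<Void>> tasks = new ArrayList<>();
            for (ChunkHandler handler : handlers) {
                tasks.add(() -> {
                    handler.handle(chunk, length);
                    return null;
                });
            }
            // invokeAll returns only after every task has completed,
            // so the buffer can safely be reused for the next chunk.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows a handler failure as ExecutionException
            }
        }
    }
}
```

A caller might pass Files.newInputStream(path), a chunk size of 10 * 1024 * 1024, and a pool created with Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()), which also keeps the thread count close to the number of cores. The trade-off is that all workers advance in lockstep, so the slowest handler sets the pace for each chunk.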

If that doesn't work for you, how much memory do you have compared with the size of the file? If you have plenty of memory but you don't want to allocate one huge array, you could read the whole file into memory, but into lots of separate, smaller byte arrays. You'd then have to write an input stream which spans all of those byte arrays, but that should be doable.
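A minimal sketch of such a spanning stream is shown below. The class names ChunkedFile and ChunkedInputStream are hypothetical, and the sketch assumes the list of arrays is filled once and never modified afterwards, so that any number of threads can read it concurrently while each thread keeps its own cursor. It needs Java 11 or later for InputStream.readNBytes(int).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class ChunkedFile {
    /** Reads the whole file into memory as a list of arrays of at most chunkSize bytes. */
    static List<byte[]> load(Path file, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] chunk;
            // readNBytes(int) returns an empty array once the end of the file is reached.
            while ((chunk = in.readNBytes(chunkSize)).length > 0) {
                chunks.add(chunk);
            }
        }
        return List.copyOf(chunks); // unmodifiable list; the arrays themselves are shared as-is
    }
}

/** Read-only stream over a shared list of byte arrays; each thread creates its own instance. */
final class ChunkedInputStream extends InputStream {
    private final List<byte[]> chunks; // shared between all streams, never modified
    private int chunkIndex;            // per-stream cursor: current array
    private int offset;                // per-stream cursor: position within that array

    ChunkedInputStream(List<byte[]> chunks) {
        this.chunks = chunks;
    }

    /** Moves past exhausted or empty arrays; returns false once all data is consumed. */
    private boolean advance() {
        while (chunkIndex < chunks.size() && offset >= chunks.get(chunkIndex).length) {
            chunkIndex++;
            offset = 0;
        }
        return chunkIndex < chunks.size();
    }

    @Override
    public int read() {
        if (!advance()) {
            return -1;
        }
        return chunks.get(chunkIndex)[offset++] & 0xFF; // return the byte as an unsigned value
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        Objects.checkFromIndexSize(off, len, dst.length);
        if (len == 0) {
            return 0;
        }
        if (!advance()) {
            return -1;
        }
        // Copy at most up to the end of the current array; callers loop for more.
        byte[] chunk = chunks.get(chunkIndex);
        int n = Math.min(len, chunk.length - offset);
        System.arraycopy(chunk, offset, dst, off, n);
        offset += n;
        return n;
    }
}
```

The file would be loaded once, for example with ChunkedFile.load(path, 1 << 20), before the worker threads are started, and each worker would then wrap the same list in its own new ChunkedInputStream(chunks). Because the only mutable state is the per-stream cursor, no locking is needed as long as the list is handed to the threads safely (for example by starting them, or submitting their tasks, after loading completes).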
