Why is FileInputStream read slower with a bigger array?


Problem description

If I read bytes from a file into a byte[], I see that FileInputStream performs worse when the array is around 1 MB compared to 128 KB. On the 2 workstations I have tested, reading is almost twice as fast with the 128 KB buffer. Why is that?

import java.io.*;

public class ReadFileInChuncks 
{
    public static void main(String[] args) throws IOException 
    {
        byte[] buffer1 = new byte[1024*128];
        byte[] buffer2 = new byte[1024*1024];

        String path = "some 1 gb big file";

        readFileInChuncks(path, buffer1, false);

        readFileInChuncks(path, buffer1, true);
        readFileInChuncks(path, buffer2, true);
        readFileInChuncks(path, buffer1, true);
        readFileInChuncks(path, buffer2, true);
    }

    public static void readFileInChuncks(String path, byte[] buffer, boolean report) throws IOException
    {
        long t = System.currentTimeMillis();

        InputStream is = new FileInputStream(path);
        while ((readToArray(is, buffer)) != 0) {}
        is.close();

        if (report)
            System.out.println((System.currentTimeMillis() - t) + " ms");
    }

    public static int readToArray(InputStream is, byte[] buffer) throws IOException
    {
        int index = 0;
        while (index != buffer.length)
        {
            int read = is.read(buffer, index, buffer.length - index);
            if (read == -1)
                break;
            index += read;
        }
        return index;
    }
}

outputs

422 ms 
717 ms 
422 ms 
718 ms

Notice this is a redefinition of an already posted question. The other was polluted with unrelated discussions. I will mark the other for deletion.

Edit: Duplicate, really? I sure could make some better code to prove my point, but this does not answer my question

Edit2: I ran the test with every buffer size between 5 KB and 1000 KB on Win7 / JRE 1.8.0_25, and the bad performance starts at precisely 508 KB and persists for all larger sizes. Sorry for the bad diagram legend: x is buffer size, y is milliseconds

Solution

TL;DR The performance drop is caused by memory allocation, not by file reading issues.

A typical benchmarking problem: you benchmark one thing, but actually measure another.

First of all, when I rewrote the sample code using RandomAccessFile, FileChannel and ByteBuffer.allocateDirect, the threshold disappeared. File reading performance became roughly the same for 128 KB and 1 MB buffers.
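That rewrite might look roughly like the sketch below. This is my reconstruction under the answerer's stated approach, not their actual code; the class name, file path handling, and timing loop are illustrative:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ReadFileWithChannel {
    // Reads the whole file through a reused direct ByteBuffer and returns
    // the total number of bytes read. A direct buffer lets the OS deliver
    // data without the extra native-heap copy that FileInputStream.read
    // performs for large byte[] arguments.
    public static long readFileWithChannel(String path, int bufferSize) throws IOException {
        long total = 0;
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel ch = raf.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
            int n;
            while ((n = ch.read(buf)) != -1) {
                total += n;
                buf.clear(); // reuse the same buffer for the next chunk
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String path = args[0]; // path to a large test file
        for (int size : new int[] { 1024 * 128, 1024 * 1024 }) {
            long t = System.currentTimeMillis();
            long bytes = readFileWithChannel(path, size);
            System.out.println((size / 1024) + " KB buffer: " + bytes + " bytes, "
                    + (System.currentTimeMillis() - t) + " ms");
        }
    }
}
```

With this version the temporary native buffer, and hence the per-call allocation, is out of the picture, which is consistent with the threshold disappearing.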

Unlike direct ByteBuffer I/O, FileInputStream.read cannot load data directly into a Java byte array. It needs to get the data into some native buffer first, and then copy it to Java using the JNI SetByteArrayRegion function.

So we have to look at the native implementation of FileInputStream.read. It comes down to the following piece of code in io_util.c:

    if (len == 0) {
        return 0;
    } else if (len > BUF_SIZE) {
        buf = malloc(len);
        if (buf == NULL) {
            JNU_ThrowOutOfMemoryError(env, NULL);
            return 0;
        }
    } else {
        buf = stackBuf;
    }

Here BUF_SIZE == 8192. If the buffer is larger than this reserved stack area, a temporary buffer is allocated by malloc. On Windows, malloc is usually implemented via the HeapAlloc WINAPI call.
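A consequence worth noting: if no single read call asks for more than BUF_SIZE bytes, the native code stays on its reserved stack buffer and the per-call malloc never happens, even when the Java-side array is large. A minimal sketch of that workaround follows; the 8192 constant mirrors BUF_SIZE above, which is an implementation detail of this JDK version, not a documented guarantee:

```java
import java.io.IOException;
import java.io.InputStream;

public class CappedReader {
    // Mirrors BUF_SIZE in io_util.c: read requests of at most this many bytes
    // are served from the reserved native stack buffer instead of malloc.
    static final int NATIVE_STACK_BUF = 8192;

    // Fills `buffer` from the stream using reads of at most 8 KB each, so no
    // single call crosses the native malloc threshold. Returns bytes read.
    public static int readCapped(InputStream is, byte[] buffer) throws IOException {
        int index = 0;
        while (index != buffer.length) {
            int chunk = Math.min(NATIVE_STACK_BUF, buffer.length - index);
            int read = is.read(buffer, index, chunk);
            if (read == -1)
                break;
            index += read;
        }
        return index;
    }
}
```

The trade-off is more JNI round trips per megabyte, so this is only a sketch for isolating the allocation cost, not a recommendation over the FileChannel approach.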

Next, I measured the performance of HeapAlloc + HeapFree calls alone without file I/O. The results were interesting:

     128K:    5 μs
     256K:   10 μs
     384K:   15 μs
     512K:   20 μs
     640K:   25 μs
     768K:   29 μs
     896K:   33 μs
    1024K:  316 μs  <-- almost 10x leap
    1152K:  356 μs
    1280K:  399 μs
    1408K:  436 μs
    1536K:  474 μs
    1664K:  511 μs
    1792K:  553 μs
    1920K:  592 μs
    2048K:  628 μs

As you can see, the performance of OS memory allocation changes drastically at the 1 MB boundary. This can be explained by the different allocation algorithms used for small and for large chunks.

UPDATE

The documentation for HeapCreate confirms the idea of a specific allocation strategy for blocks larger than 1 MB (see the dwMaximumSize description):

Also, the largest memory block that can be allocated from the heap is slightly less than 512 KB for a 32-bit process and slightly less than 1,024 KB for a 64-bit process.

...

Requests to allocate memory blocks larger than the limit for a fixed-size heap do not automatically fail; instead, the system calls the VirtualAlloc function to obtain the memory that is needed for large blocks.
