使用多线程读取单个文件:应该加快速度吗? [英] Reading a single file with Multiple Thread: should speed up?

查看:140
本文介绍了使用多线程读取单个文件:应该加快速度吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在读取一个包含500000行的文件。
我正在测试看多个线程如何加速进程....

I'm reading a file which conatins 500000 rows. I'm testing to see how multiple thread speed up the process....

private void multiThreadRead(int num){

    for(int i=1; i<= num; i++) { 
        new Thread(readIndivColumn(i),""+i).start(); 
     } 
}

private Runnable readIndivColumn(final int colNum){
    return new Runnable(){
        @Override
        public void run() {
            // TODO Auto-generated method stub
            try {

                long startTime = System.currentTimeMillis();
                System.out.println("From Thread no:"+colNum+" Start time:"+startTime);

                RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv","r");
                String line = "";
                //System.out.println("From Thread no:"+colNum);

                while((line = raf.readLine()) != null){
                    //System.out.println(line);
                    //System.out.println(StatUtils.getCellValue(line, colNum));
                }


                long elapsedTime = System.currentTimeMillis() - startTime;

                String formattedTime = String.format("%d min, %d sec",  
                        TimeUnit.MILLISECONDS.toMinutes(elapsedTime), 
                        TimeUnit.MILLISECONDS.toSeconds(elapsedTime) -  
                        TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)) 
                    );

                System.out.println("From Thread no:"+colNum+" Finished Time:"+formattedTime);
            } 
            catch (Exception e) {
                // TODO Auto-generated catch block
                System.out.println("From Thread no:"+colNum +"===>"+e.getMessage());

                e.printStackTrace();
            }
        }
    };
}

private void sequentialRead(int num){
    try{
        long startTime = System.currentTimeMillis();
        System.out.println("Start time:"+startTime);

        for(int i =0; i < num; i++){
            RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv","r");
            String line = "";

            while((line = raf.readLine()) != null){
                //System.out.println(line);
            }               
        }

        long elapsedTime = System.currentTimeMillis() - startTime;

        String formattedTime = String.format("%d min, %d sec",  
                TimeUnit.MILLISECONDS.toMinutes(elapsedTime), 
                TimeUnit.MILLISECONDS.toSeconds(elapsedTime) -  
                TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)) 
            );

        System.out.println("Finished Time:"+formattedTime);
    }
    catch (Exception e) {
        e.printStackTrace();
        // TODO: handle exception
    }

}
    public TesterClass() {

    sequentialRead(1);      
    this.multiThreadRead(1);

}

对于num = 1我得到以下结果:

for num = 1 I get following result:

开始时间:1326224619049

Start time:1326224619049

完成时间:2分钟,14秒

Finished Time:2 min, 14 sec

顺序读取ENDS ............

Sequential read ENDS...........

多线程读取开始:

来自线程编号:1开始时间:1326224753606

From Thread no:1 Start time:1326224753606

来自线程编号:1完成时间:2分钟,13秒

From Thread no:1 Finished Time:2 min, 13 sec

多线程读取ENDS .....

Multi-Thread read ENDS.....

对于num = 5我得到以下结果:

for num = 5 I get following result:

    formatted Time:10 min, 20 sec

Sequential read ENDS...........

Multi-Thread read starts:

From Thread no:1 Start time:1326223509574
From Thread no:3 Start time:1326223509574
From Thread no:4 Start time:1326223509574
From Thread no:5 Start time:1326223509574
From Thread no:2 Start time:1326223509574
From Thread no:4 formatted Time:5 min, 54 sec
From Thread no:2 formatted Time:6 min, 0 sec
From Thread no:3 formatted Time:6 min, 7 sec
From Thread no:5 formatted Time:6 min, 23 sec
From Thread no:1 formatted Time:6 min, 23 sec
Multi-Thread read ENDS.....

我的问题是:不应该多线程读取需要大约。 2.13秒?
你能解释为什么多线程解决方案花了太长时间吗?

My question is: shouldn't multi-threaded read takes approx. 2.13 sec ? Can you please explain why is it taking too long with multi-threaded solution?

提前致谢。

推荐答案

由于文件读取主要是等待磁盘I / O,因此您遇到的问题是磁盘不会因为许多线程使用它而旋转得更快:)

Since file reading is mainly waiting for disk I/O, you have the problem that the disk won't spin faster just because it's used by many threads :)

这篇关于使用多线程读取单个文件:应该加快速度吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆