Reading a single file with multiple threads: should it speed up?
Question
I'm reading a file which contains 500000 rows. I'm testing to see how multiple threads speed up the process.
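// Uses java.io.RandomAccessFile and java.util.concurrent.TimeUnit.

// Starts num threads; each one reads the whole file on its own.
private void multiThreadRead(int num) {
    for (int i = 1; i <= num; i++) {
        new Thread(readIndivColumn(i), "" + i).start();
    }
}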
private Runnable readIndivColumn(final int colNum) {
    return new Runnable() {
        @Override
        public void run() {
            long startTime = System.currentTimeMillis();
            System.out.println("From Thread no:" + colNum + " Start time:" + startTime);
            // try-with-resources closes the file even if reading fails
            try (RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r")) {
                String line;
                while ((line = raf.readLine()) != null) {
                    // System.out.println(StatUtils.getCellValue(line, colNum));
                }
                long elapsedTime = System.currentTimeMillis() - startTime;
                String formattedTime = String.format("%d min, %d sec",
                        TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                        TimeUnit.MILLISECONDS.toSeconds(elapsedTime)
                                - TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));
                System.out.println("From Thread no:" + colNum + " Finished Time:" + formattedTime);
            } catch (Exception e) {
                System.out.println("From Thread no:" + colNum + "===>" + e.getMessage());
                e.printStackTrace();
            }
        }
    };
}
// Reads the whole file num times, one pass after another, on the calling thread.
private void sequentialRead(int num) {
    try {
        long startTime = System.currentTimeMillis();
        System.out.println("Start time:" + startTime);
        for (int i = 0; i < num; i++) {
            // try-with-resources closes the file after each pass
            try (RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r")) {
                String line;
                while ((line = raf.readLine()) != null) {
                    // System.out.println(line);
                }
            }
        }
        long elapsedTime = System.currentTimeMillis() - startTime;
        String formattedTime = String.format("%d min, %d sec",
                TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                TimeUnit.MILLISECONDS.toSeconds(elapsedTime)
                        - TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));
        System.out.println("Finished Time:" + formattedTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
public TesterClass() {
    sequentialRead(1);
    this.multiThreadRead(1);
}
For num = 1 I get the following result:
Start time:1326224619049
Finished Time:2 min, 14 sec
Sequential read ENDS...........
Multi-Thread read starts:
From Thread no:1 Start time:1326224753606
From Thread no:1 Finished Time:2 min, 13 sec
Multi-Thread read ENDS.....
For num = 5 I get the following result:
formatted Time:10 min, 20 sec
Sequential read ENDS...........
Multi-Thread read starts:
From Thread no:1 Start time:1326223509574
From Thread no:3 Start time:1326223509574
From Thread no:4 Start time:1326223509574
From Thread no:5 Start time:1326223509574
From Thread no:2 Start time:1326223509574
From Thread no:4 formatted Time:5 min, 54 sec
From Thread no:2 formatted Time:6 min, 0 sec
From Thread no:3 formatted Time:6 min, 7 sec
From Thread no:5 formatted Time:6 min, 23 sec
From Thread no:1 formatted Time:6 min, 23 sec
Multi-Thread read ENDS.....
My question is: shouldn't the multi-threaded read take approx. 2 min 13 sec? Can you please explain why it takes so long with the multi-threaded solution?
Thanks in advance.
Recommended answer
Since file reading is mainly waiting for disk I/O, you have the problem that the disk won't spin any faster just because many threads are using it :)
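If the disk is the bottleneck, one way to still use several threads is to read the file from disk only once and let the threads work on the lines in memory. The following is a minimal sketch of that idea, not the asker's code: the class name ReadOnceProcessInParallel, the helper sumColumn, and the assumption that every cell holds a whole number are all made up for illustration, and it assumes the file fits in memory. It also swaps RandomAccessFile for Files.readAllLines, since RandomAccessFile does no buffering of its own and its readLine is therefore likely to be slow on a large file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReadOnceProcessInParallel {

    // Hypothetical per-column work: sum the numeric cells of one column (0-based index).
    static long sumColumn(List<String> lines, int colIndex) {
        long sum = 0;
        for (String line : lines) {
            sum += Long.parseLong(line.split(",")[colIndex]);
        }
        return sum;
    }

    // Reads the file from disk once, then runs one task per column on the in-memory lines.
    static long[] sumAllColumns(Path csv, int numColumns)
            throws IOException, InterruptedException, ExecutionException {
        List<String> lines = Files.readAllLines(csv, StandardCharsets.UTF_8); // single pass over the disk
        ExecutorService pool = Executors.newFixedThreadPool(numColumns);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int c = 0; c < numColumns; c++) {
                final int colIndex = c;
                Callable<Long> task = () -> sumColumn(lines, colIndex);
                futures.add(pool.submit(task));
            }
            long[] sums = new long[numColumns];
            for (int c = 0; c < numColumns; c++) {
                sums[c] = futures.get(c).get(); // wait for this column's result
            }
            return sums;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a small temporary CSV so the sketch runs without the original test1.csv.
        Path csv = Files.createTempFile("test1", ".csv");
        try {
            List<String> rows = new ArrayList<>();
            for (int i = 1; i <= 1000; i++) {
                rows.add(i + "," + (2 * i) + "," + (3 * i));
            }
            Files.write(csv, rows, StandardCharsets.UTF_8);
            System.out.println(Arrays.toString(sumAllColumns(csv, 3)));
            // prints [500500, 1001000, 1501500]
        } finally {
            Files.deleteIfExists(csv);
        }
    }
}
```

With this layout the threads only help if the per-column work is CPU-heavy; for work as light as a sum, the single read is likely to dominate the total time either way.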