Multithreaded file reading
Question
I'm trying to write Java code that reads a file with several threads and counts the words in it. Each thread should read different lines. The word counting works when I run a single thread, but with several threads they read the same line and increment the line counter at the same time. I was sure that the synchronized keyword on the read method would fix it, but it didn't. What should I do to fix it?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

public class WordCounterr implements Runnable {
    private static Hashtable<String, Integer> ht = new Hashtable<String, Integer>();
    private int lineCounter;
    private String path;
    private int tNumber;
    //private final AtomicInteger whichLine = new AtomicInteger();
    private static int whichLine;
    private static boolean flag;

    public WordCounterr(String path,int num){
        lineCounter = 0;
        //whichLine = 0;
        flag= false;
        this.path=path;
        tNumber = num;
    }

    public void countWords(String s) throws IOException{
        char[] c = s.toCharArray();
        String str="";
        char ch;
        for(int k=0;k<c.length;k++){
            ch=c[k];
            if((ch>40 && ch<91) ||(ch>96 && ch<123)){
                if(ch>40 && ch<91)
                    ch+=32;
                str+=ch;
            }
            else if(ch==32 ||k==c.length-1){
                if(str.length()>1){ // check whether this word has already been found
                    if(ht.containsKey(str))
                        ht.put(str,ht.get(str)+1); // found: increment the value stored under the key
                    else
                        ht.put(str,1); // not found: add the word to the Hashtable
                }
                str="";
            }
        }
    }

    public synchronized void read(String path) throws IOException{
        BufferedReader buf=new BufferedReader(new FileReader(path));
        String linia ;
        for(int i=0;i<whichLine;i++){
            linia=buf.readLine();
        }
        if((linia=buf.readLine())!=null){
            System.out.println(linia);
            countWords(linia);
            lineCounter++;
            System.out.println("watek nr:"+tNumber+"ktora linia:"+whichLine);
            whichLine++;
            /*try{
                Thread.sleep(100);
            }catch(InterruptedException el){
                System.out.println(el.toString());
            }*/
        } else
            setFlag(true);
        buf.close(); // remember to close the file
    }

    public synchronized void print(){
        if(getFlag()){
            setFlag(false);
            System.out.println(ht);
        }
        System.out.println("watek nr: "+tNumber+", przeanalizowano "+ lineCounter+ "linii tekstu");
    }

    public void setFlag(boolean val){
        flag=val;
    }

    public boolean getFlag(){
        return flag;
    }

    @Override
    public void run() {
        try{
            while(getFlag()==false) {
                read(path);
                Thread.yield(); //let other thread read
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }catch(IOException ex){
            System.out.println(ex.toString());
        }//catch(InterruptedException el){
        // System.out.println(el.toString());
        //}
        print();
    }

    public static void main(String[] args) throws IOException, InterruptedException{
        String path = args[0];
        int tNum = Integer.parseInt(args[1]);
        Thread[] thread = new Thread[tNum]; // array of threads
        for (int i = 0; i < tNum; i++){
            thread[i] =new Thread(new WordCounterr(path,i));
        }
        for (int i = 0; i < tNum; i++)
            thread[i].start();
    }
}
Recommended answer
I suspect that, even once it works, this will still read the file inefficiently. Try changing the synchronization point: it should not cover the whole read method, which reopens the file and reads through it again on every call. Note also that synchronized on an instance method locks only that instance, and each thread here has its own WordCounterr, so the threads never compete for the same lock. Instead, synchronize only the step that reads the next line of the file. You can do this by giving every WordCounterr instance the same reader object and synchronizing only the operation that advances to the next line and returns its content. Counting the words in a line needs no synchronization; only the update of the Hashtable must be synchronized. Reading the file content in parallel can be synchronized as shown below:
static class Reader implements Runnable {
    int lineReaded = 0; // number of lines read by this thread
    final Scanner scanner; // the same Scanner is shared by all Reader instances

    Reader(Scanner scanner) {
        this.scanner = scanner;
    }

    public void run() {
        boolean hasNext = true;
        while (hasNext) {
            hasNext = false;
            // Lock on the shared scanner so only one thread advances it at a time.
            synchronized (scanner) {
                if (scanner.hasNextLine()) {
                    hasNext = true;
                    String line = scanner.nextLine();
                    ++lineReaded;
                }
            }
            // Stand-in for per-line work, done outside the lock.
            try {
                Thread.sleep((long) (Math.random() * 100));
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
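To tie the pieces together, here is a minimal end-to-end sketch of the same idea, offered as one possible arrangement rather than a definitive implementation. The names SharedLineWordCount, LineSource and countWords are made up for illustration. It also makes two assumptions that go beyond the answer: a word is simplified to a run of ASCII letters (which differs from the character ranges in the question's countWords), and a ConcurrentHashMap with merge replaces the Hashtable, because a containsKey/get/put sequence on a Hashtable is not atomic as a whole even though each call is synchronized.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedLineWordCount {

    // Hypothetical helper: hands out lines one at a time from a single shared reader.
    static final class LineSource {
        private final BufferedReader reader;

        LineSource(BufferedReader reader) {
            this.reader = reader;
        }

        // Only this step is synchronized, so each line is handed to exactly one thread.
        synchronized String nextLine() throws IOException {
            return reader.readLine(); // null at end of file
        }
    }

    static Map<String, Integer> countWords(String path, int threadCount)
            throws IOException, InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            LineSource source = new LineSource(reader);
            Thread[] workers = new Thread[threadCount];
            for (int i = 0; i < threadCount; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        String line;
                        while ((line = source.nextLine()) != null) {
                            // Splitting happens outside the lock, so threads work in parallel.
                            // Assumption: a word is a run of ASCII letters.
                            for (String word : line.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                                if (!word.isEmpty()) {
                                    counts.merge(word, 1, Integer::sum); // atomic per-key update
                                }
                            }
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join(); // wait for all workers before the reader is closed
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: SharedLineWordCount <file> <threads>");
            return;
        }
        System.out.println(countWords(args[0], Integer.parseInt(args[1])));
    }
}
```

Because every worker shares one LineSource, the lock in nextLine is the same object for all threads, which is exactly what the per-instance synchronized read method in the question lacked. The file is also opened once instead of being reopened and skipped through for every line.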