Java - Fastest Way to Read Text Files Char by Char


Question


I have nearly 500 text files with 10 million words. I have to index those words. What is the fastest way to read from a text file character by character? Here is my initial attempt:

InputStream ist = new FileInputStream(this.path + "/" + doc);
BufferedReader in = new BufferedReader(new InputStreamReader(ist));

String line;

while ((line = in.readLine()) != null) {

    line = line.toUpperCase(Locale.ENGLISH);
    String word = "";

    // j < line.length(): with <= the final charAt(j) throws StringIndexOutOfBoundsException
    for (int j = 0; j < line.length(); j++) {
        char c = line.charAt(j);
        // OPERATIONS
    }
}

Answer

Switching from readLine() to read() on a BufferedReader will not make a considerable difference in performance.

Read more: Peter Lawrey's comparison of read() and readLine()
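For reference, here is a minimal sketch of what reading character by character through read() could look like. The class name CharByCharReader and the helpers readWords and flush are hypothetical names chosen for illustration, and the sketch assumes that words are separated by whitespace, which may not match your real tokenization rules.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CharByCharReader {

    // Hypothetical helper: pulls one character at a time from a BufferedReader
    // and collects whitespace-separated words, upper-cased as in the question.
    public static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            int ch;
            // read() returns the next char as an int, or -1 at end of stream.
            while ((ch = reader.read()) != -1) {
                if (Character.isWhitespace(ch)) {
                    flush(current, words);
                } else {
                    current.append((char) ch);
                }
            }
            flush(current, words); // the last word may not be followed by whitespace
        }
        return words;
    }

    // Moves the buffered characters into the word list, if there are any.
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString().toUpperCase(Locale.ENGLISH));
            current.setLength(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file here; for a real file you would
        // pass something like new InputStreamReader(new FileInputStream(...)).
        System.out.println(readWords(new StringReader("hello how\tare you?\n")));
        // prints [HELLO, HOW, ARE, YOU?]
    }
}
```

Because the BufferedReader fills its internal buffer in large chunks, each read() call is served from memory, which is why this loop and the readLine() loop tend to perform similarly.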

Now, coming back to your original question:
Input string: hello how are you?
So you need to index the words of the line, i.e.:

BufferedReader r = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = r.readLine()) != null) {
   String[] splitString = line.split("\\s+");
   //Do stuff with the array here, i.e. construct the index.
}

Note: The pattern \\s+ treats any run of one or more whitespace characters (tab, space, etc.) as the delimiter between words.
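To make the "construct the index" step concrete, here is a small sketch that splits one line on \\s+ and records the position of each word. The class SplitIndexExample, the method indexLine and the choice of a word-to-positions map are assumptions made for illustration; the index you actually need may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SplitIndexExample {

    // Hypothetical index shape: each word maps to the positions where it occurs in the line.
    public static Map<String, List<Integer>> indexLine(String line) {
        Map<String, List<Integer>> index = new TreeMap<>();
        // trim() first: leading whitespace would otherwise produce an empty first token.
        String[] tokens = line.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].isEmpty()) {
                continue; // a blank line splits into a single empty string
            }
            index.computeIfAbsent(tokens[i], k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(indexLine("hello  how\tare you?"));
        // prints {are=[2], hello=[0], how=[1], you?=[3]}
    }
}
```

Note that punctuation stays attached to the word ("you?"), because only whitespace is treated as a delimiter. Also, String.split compiles the regex \\s+ on every call, so with files of this size it may be worth compiling a java.util.regex.Pattern once and reusing its split method; measure before assuming it matters.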
