Fastest way to read STDIN line by line?


Question

I'm looking for the most time-efficient way to read STDIN line by line.

The first line is the number of conditions to test. All following lines are conditions (strings) of at most 100 000 characters each.

I have already tried the following (with the results for 4 lines of 90 000 characters each):

  • Scanner with a while-loop (7255 ms)

Scanner sc = new Scanner(System.in);
int numberOfLines = Integer.parseInt(sc.nextLine());
long start = 0;
int i = 1;
while (i<=numberOfLines){
    start = System.currentTimeMillis();
    sc.nextLine();
    Debug.println((System.currentTimeMillis()-start) + "ms for scanner while");
    i++;
}

    • Results:

      1. 3228ms for scanner while
      2. 2264ms for scanner while
      3. 1309ms for scanner while
      4. 454ms for scanner while

  • Scanner with a for-loop (7078 ms)

Scanner sc = new Scanner(System.in);
int numberOfLines = Integer.parseInt(sc.nextLine());
long start = 0;
for (int i = 1; i <= numberOfLines; i++) {
    start = System.currentTimeMillis();
    sc.nextLine();
    Debug.println((System.currentTimeMillis() - start) + "ms for scanner for");
}

    • Results:

      1. 3168ms for scanner for
      2. 2207ms for scanner for
      3. 1236ms for scanner for
      4. 467ms for scanner for

  • BufferedReader with a for-loop (7403 ms)

try {
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

    int numberOfLines = Integer.parseInt(br.readLine());
    long start = 0;
    for (int i = 0; i < numberOfLines; i++) {
        start = System.currentTimeMillis();
        br.readLine();
        Debug.println((System.currentTimeMillis() - start) + "ms for bufferreader for");
    }
} catch (Exception e) {
    System.err.println("Error:" + e.getMessage());
}

    • Results:

      1. 3273ms for bufferreader for
      2. 2330ms for bufferreader for
      3. 1293ms for bufferreader for
      4. 507ms for bufferreader for

  • BufferedReader with a while-loop (7461 ms)

try {
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

    int numberOfLines = Integer.parseInt(br.readLine());
    int i = 0;
    long start = 0;
    while (i < numberOfLines) {
        start = System.currentTimeMillis();
        br.readLine();
        Debug.println((System.currentTimeMillis() - start) + "ms for bufferreader while");
        i++;
    }
} catch (Exception e) {
    System.err.println("Error:" + e.getMessage());
}

    • Results:

      1. 3296ms for bufferreader while
      2. 2358ms for bufferreader while
      3. 1307ms for bufferreader while
      4. 500ms for bufferreader while

While debugging the time taken, I noticed that the time per read decreases with each read. Is it possible to restrict the number of bytes that are initialized? For example, with a maximum of 100 000 characters per line, limit the Scanner/BufferedReader to initialize only 100 000 characters; after each read it would then refill itself with the next 100 000 characters.

Any ideas on this matter are more than welcome.

EDIT: Added the code for each scenario along with the time taken per line read. Also changed 100.000 to 100 000 for readability.
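On the buffer-size part of the question: BufferedReader has a two-argument constructor that takes the buffer size in characters (not bytes). The sketch below is only an illustration of that constructor; the class name SizedBufferDemo, the method readLineLengths, and the chosen size are assumptions made for this example, and whether a different size changes the timings above has not been measured here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical demo class; not part of the original question.
final class SizedBufferDemo {
    // Assumed size: one maximum-length line plus room for a line terminator.
    static final int BUFFER_CHARS = 100_000 + 2;

    // Reads the line count from the first line, then returns the length of each following line.
    static int[] readLineLengths(Reader source, int bufferChars) throws IOException {
        BufferedReader br = new BufferedReader(source, bufferChars);
        String first = br.readLine();
        int numberOfLines = (first == null) ? 0 : Integer.parseInt(first.trim());
        int[] lengths = new int[numberOfLines];
        for (int i = 0; i < numberOfLines; i++) {
            String line = br.readLine();
            if (line == null) {
                break; // input ended early; remaining entries stay 0
            }
            lengths[i] = line.length();
        }
        return lengths;
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        int[] lengths = readLineLengths(new InputStreamReader(System.in), BUFFER_CHARS);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(lengths.length + " lines read in " + elapsedMs + " ms");
    }
}
```

Timing the whole loop once, as in main, keeps the cost of printing out of the measurement of each read.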

Recommended answer

I looked inside the BufferedReader#readLine source and see several problems:

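1. It uses StringBuffer instead of StringBuilder, which adds synchronization overhead.
2. There also seems to be data-copying overhead; I'm not entirely sure, so it's worth checking.
3. BufferedReader uses a dedicated monitor object, which adds even more synchronization overhead.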

There are two things you could try:

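1. Write your own buffering, which could save some time on double copying of the data.
2. Write your own nextLine method that uses StringBuilder and walks through the source data in a simple loop.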
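The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the class FastLineReader and its members are hypothetical names, it assumes single-byte input (each byte is widened to one char, as in ISO-8859-1), and it has not been benchmarked against Scanner or BufferedReader.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not a JDK class): own byte buffer plus an unsynchronized nextLine.
final class FastLineReader {
    private final InputStream in;
    private final byte[] buf;
    private int pos = 0; // index of the next unread byte in buf
    private int len = 0; // number of valid bytes in buf

    // bufferSize is assumed to be greater than zero.
    FastLineReader(InputStream in, int bufferSize) {
        this.in = in;
        this.buf = new byte[bufferSize];
    }

    // Refills the buffer from the stream; returns false at end of stream.
    private boolean fill() throws IOException {
        len = in.read(buf, 0, buf.length);
        pos = 0;
        return len > 0;
    }

    // Returns the next line without its terminator, or null at end of input.
    String nextLine() throws IOException {
        if (pos >= len && !fill()) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        while (true) {
            int start = pos;
            while (pos < len && buf[pos] != '\n') {
                pos++;
            }
            // Assumption: single-byte encoding, so each byte becomes one char.
            for (int i = start; i < pos; i++) {
                sb.append((char) (buf[i] & 0xFF));
            }
            if (pos < len) {
                pos++; // skip the '\n' that ended the line
                break;
            }
            if (!fill()) {
                break; // end of stream: return what was collected
            }
        }
        int n = sb.length();
        if (n > 0 && sb.charAt(n - 1) == '\r') {
            sb.setLength(n - 1); // drop the '\r' of a "\r\n" terminator
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        FastLineReader reader = new FastLineReader(System.in, 1 << 16);
        String first = reader.nextLine();
        int numberOfLines = (first == null) ? 0 : Integer.parseInt(first.trim());
        long totalChars = 0;
        for (int i = 0; i < numberOfLines; i++) {
            String line = reader.nextLine();
            if (line == null) {
                break;
            }
            totalChars += line.length();
        }
        System.out.println(totalChars + " characters read");
    }
}
```

Reading raw bytes skips the charset decoding step that InputStreamReader performs, which is only safe under the single-byte assumption stated above; multi-byte input such as UTF-8 would need a proper decoder.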
