Java java.util.regex.MatchResult counter problems with Scanner

Views: 42  Posted: 2021/2/10 18:35:22  Tags: java java.util.scanner

Problem description

I'm using a java.util.Scanner to scan all occurrences of a given regex from a big string.

// body (String) and pattern (java.util.regex.Pattern) are defined elsewhere
Scanner sc = new Scanner(body);
sc.useDelimiter("");
String match;
// A horizon of 0 places no bound on how far ahead the scanner searches
while ((match = sc.findWithinHorizon(pattern, 0)) != null)
{
    MatchResult mr = sc.match();
    System.out.println("Match string: " + mr.group());
    System.out.println("Match string using indexes: " + body.substring(mr.start(), mr.end()));
}

The strange thing is that after a certain number of scans, the group() method still returns the correct occurrence, while the start() and end() methods return wrong indexes, as if the scan had restarted from the beginning of the file. The regex spans multiple lines (I use this regex to detect a line break: "\r\n|[\n\r\u2028\u2029\u0085]").

Do you have any hints? Could it be related to the "horizon" parameter (I've tried different values for it)?

In more detail, it seems related to the size of the file (more than 1000 chars): after about 1000 characters the counter restarts from 0 (e.g. the first wrong index pair after 1003:1020 becomes 3:120).
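
The symptom described above would be consistent with Scanner reporting match positions relative to its internal read buffer rather than to the whole input, though that is an assumption and not something the question confirms. If the whole text is already in memory as a String, one possible workaround is to skip Scanner and use java.util.regex.Matcher directly, whose start() and end() are offsets into the string it was created on. A minimal sketch follows; the class name MatcherOffsets, the helper findSpans and the generated sample body are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherOffsets {
    // Hypothetical helper: collects {start, end} for every match of pattern in body.
    // The offsets come from Matcher, so they index into body itself.
    static List<int[]> findSpans(String body, Pattern pattern) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = pattern.matcher(body);
        while (m.find()) {
            spans.add(new int[] { m.start(), m.end() });
        }
        return spans;
    }

    public static void main(String[] args) {
        // Same line-break regex as in the question, escaped for the regex engine
        Pattern lineBreak = Pattern.compile("\\r\\n|[\\n\\r\\u2028\\u2029\\u0085]");

        // Sample input well over 1000 chars (illustrative data, not from the question)
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            sb.append("line ").append(i).append("\r\n");
        }
        String body = sb.toString();

        for (int[] span : findSpans(body, lineBreak)) {
            // Both values can be used directly with body.substring(...)
            String viaIndexes = body.substring(span[0], span[1]);
            if (!viaIndexes.equals("\r\n")) {
                System.out.println("Unexpected text at " + span[0] + ":" + span[1]);
            }
        }
        System.out.println("Checked " + findSpans(body, lineBreak).size() + " matches");
    }
}
```

This only applies when the input fits in memory as a single String; for streamed input the offsets would have to be tracked some other way.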
