Java, Regular Expression HasNext starts with empty line, multi-platform support
Problem description
I need to process the following file on both Unix and Windows:
a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g

a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g

a;b

a;b
c;d;e;f;g
c;d;e;f;g
c;d;e;f;g
I need to process every a;b that has a block of data underneath it. For example, the third a;b shouldn't be processed.
Currently I am splitting this kind of text into blocks with a Java Scanner, using the following regular expression as the delimiter:
Scanner fileScanner = new Scanner(file);
try {
    fileScanner.useDelimiter(Pattern.compile("^$", Pattern.MULTILINE));
    while (fileScanner.hasNext()) {
        String line;
        while ((line = fileScanner.nextLine()).isEmpty());
        InputStream is = new ByteArrayInputStream(fileScanner.next().getBytes("UTF-8"));
        ...
For the third a;b, this still passes empty input to the ByteArrayInputStream.
How can I check whether the first line of fileScanner.next() is an empty line, and then execute a nextLine() statement followed by a continue statement?
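Recommended answer
Use the regular expression pattern
(?m)^(?:.+(?:\\r?\\n|\\Z)){2,}
which matches two or more non-empty lines. In other words, it matches two or more (the quantifier (?:...){2,}) lines that contain one or more characters (.+), each followed by either (the alternation (?:...|...)) a newline (\\r?\\n) or the end of the string (\\Z). The backslashes are doubled because the pattern is written as a Java string literal. The optional \\r makes the pattern accept Windows line endings as well as Unix ones.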
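To make the pieces of the pattern easier to see, here is a minimal sketch that writes the same regular expression with Pattern.COMMENTS, which makes the regex engine ignore whitespace and "# ..." comments inside the pattern. The class name CommentedPattern and the sample input are assumptions made only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class CommentedPattern {
    // Same pattern as in the answer, spread out with Pattern.COMMENTS so that
    // whitespace and "# ..." comments inside the regex are ignored.
    static final Pattern BLOCK = Pattern.compile(
            "(?m)^                  # start of a line\n"
          + "(?:                    # one non-empty line:\n"
          + "  .+                   #   one or more characters\n"
          + "  (?: \\r?\\n | \\Z )  #   newline, or end of input\n"
          + "){2,}                  # two or more such lines in a row",
            Pattern.COMMENTS);

    public static void main(String[] args) {
        // Sample input (an assumption): one full block, a blank line, a lone a;b.
        Matcher m = BLOCK.matcher("a;b\nc;d;e;f;g\n\na;b\n");
        while (m.find()) {
            // Only the first two lines form a block; the lone a;b is skipped.
            System.out.print(m.group());
        }
    }
}
```

The compact one-line form behaves the same way; the commented form is only meant as a reading aid.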
The multiline modifier (?m) means that ^ matches at the beginning of each line, not just at the beginning of the string.
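The effect of the modifier can be checked with a small sketch that counts how often a line-anchored pattern matches with and without (?m). The class name MultilineDemo, the helper countMatches, and the sample text are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class MultilineDemo {
    // Counts how many times the given regex matches in the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "a;b\nc;d\na;b\n";
        // Without (?m), ^ only matches at the very start of the input.
        System.out.println(countMatches("^a;b", text));     // prints 1
        // With (?m), ^ also matches right after each line terminator.
        System.out.println(countMatches("(?m)^a;b", text)); // prints 2
    }
}
```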
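import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "...";  // the text to search
// Each match is one run of two or more consecutive non-empty lines.
Pattern p = Pattern.compile("(?m)^(?:.+(?:\\r?\\n|\\Z)){2,}");
Matcher m = p.matcher(str);
while (m.find()) {
    String match = m.group();
    System.out.println(match);
}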
See this demo.
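As a rough end-to-end sketch of how this could replace the Scanner loop from the question, the following applies the pattern to a shortened version of the sample data with both Unix and Windows line endings and wraps each block in a ByteArrayInputStream. The class name BlockExtractor, the method extractBlocks, and the shortened sample text are assumptions, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class BlockExtractor {
    // Two or more consecutive non-empty lines: a header plus at least one data line.
    private static final Pattern BLOCK =
            Pattern.compile("(?m)^(?:.+(?:\\r?\\n|\\Z)){2,}");

    // Returns every header-plus-data block found in the text, in order.
    static List<String> extractBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened sample: a full block, a lone a;b, and a final block
        // without a trailing newline.
        String unix = "a;b\nc;d;e;f;g\nc;d;e;f;g\n\na;b\n\na;b\nc;d;e;f;g";
        String windows = unix.replace("\n", "\r\n");
        for (String text : new String[] {unix, windows}) {
            for (String block : extractBlocks(text)) {
                // Each block can be handed on as a stream, as in the question.
                InputStream is =
                        new ByteArrayInputStream(block.getBytes(StandardCharsets.UTF_8));
                System.out.println("block of " + block.length() + " chars");
            }
        }
    }
}
```

Because a lone a;b never produces a match, no empty input reaches the ByteArrayInputStream and no explicit empty-line check or continue statement is needed.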