Java regex escaped characters


Question

When matching certain characters (such as line feed), you can use the regex "\\n" or indeed just "\n". For example, the following splits a string into an array of lines:

String[] lines = allContent.split("\\r?\\n");

But the following works just as well:

String[] lines = allContent.split("\r?\n");

My question:

Do the above two work in exactly the same way, or is there any subtle difference? If the latter, can you give an example case where you get different results?

Or is there a difference only in [possible/theoretical] performance?

Recommended answer

There is no difference in the current scenario. Ordinary string escape sequences are formed with a single backslash followed by a valid escape character ("\n", "\r", etc.) and are resolved by the Java compiler. Regex escape sequences are formed with a literal backslash (that is, a double backslash in the Java string literal) followed by a valid regex escape character ("\\n", "\\d", etc.) and are resolved by the regex engine.

"\n" (an escape sequence) is a literal LF (newline) and "\\n" is a regex escape sequence that matches an LF symbol.

"\r" (an escape sequence) is a literal CR (carriage return) and "\\r" is a regex escape sequence that matches a CR symbol.

"\t" (an escape sequence) is a literal tab symbol and "\\t" is a regex escape sequence that matches a tab symbol.
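
To make the two escaping layers concrete, here is a minimal sketch. The class name EscapeLayers and the helper splitLines are made up for illustration and are not part of the original answer. It shows that the two string literals hold different characters, yet both split the same input into the same lines.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EscapeLayers {
    // Hypothetical helper: splits text using the given regex source.
    static String[] splitLines(String text, String regex) {
        return Pattern.compile(regex).split(text);
    }

    public static void main(String[] args) {
        // The compiler resolves these escapes: the regex engine receives CR, '?', LF.
        String compilerLevel = "\r?\n";
        // The regex engine resolves these escapes: it receives \, r, ?, \, n.
        String regexLevel = "\\r?\\n";

        System.out.println(compilerLevel.length()); // 3
        System.out.println(regexLevel.length());    // 5

        String text = "a\r\nb\nc";
        System.out.println(Arrays.toString(splitLines(text, compilerLevel))); // [a, b, c]
        System.out.println(Arrays.toString(splitLines(text, regexLevel)));    // [a, b, c]
    }
}
```

The regex engine ends up matching the same characters either way. Only the layer that interprets the backslash differs.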

See the Java regex docs for the list of supported regex escapes.

However, if you use the Pattern.COMMENTS flag (which allows comments and free-form formatting by making the regex engine ignore all unescaped whitespace in the pattern), you will need to use either "\\n" or "\\\n" in the Java string literal to match a newline (LF), and "\\r" or "\\\r" to match a carriage return (CR).

See the Java test:

String s = "\n";
System.out.println(s.replaceAll("\n", "LF")); // => LF
System.out.println(s.replaceAll("\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\n", "<LF>")); 
// => <LF>
//<LF>

Why does the last one produce <LF>, a newline, and <LF>? Because "(?x)\n" is equivalent to "", an empty pattern: the literal newline is unescaped whitespace, so comments mode ignores it, and the empty pattern matches the empty string before the newline and after it.
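
The same point applies when the flag is passed to Pattern.compile instead of being written inline as (?x). Below is a minimal sketch under that assumption. The class name CommentsModeDemo and the layout of the pattern are illustrative only. The line breaks that must be matched are written as regex escapes ("\\r", "\\n"), while the literal "\n" inside the string merely ends a comment.

```java
import java.util.regex.Pattern;

public class CommentsModeDemo {
    // Line-break pattern in comments mode. The real newline ("\n") ends the first
    // comment; the escaped forms ("\\r", "\\n") are what the regex engine matches.
    static final Pattern LINE_BREAK = Pattern.compile(
            "\\r?   # optional carriage return\n"
          + "\\n    # line feed",
            Pattern.COMMENTS);

    // Hypothetical helper: splits text on CRLF or LF line breaks.
    static String[] splitLines(String text) {
        return LINE_BREAK.split(text);
    }

    public static void main(String[] args) {
        for (String line : splitLines("one\r\ntwo\nthree")) {
            System.out.println(line);
        }
    }
}
```

Had the pattern used "\r?\n" with this flag, the CR and LF would be ignored as whitespace and the pattern would not match line breaks as intended.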
