Java regular expression to remove all non alphanumeric characters EXCEPT spaces
Question
I'm trying to write a regular expression in Java which removes all non-alphanumeric characters from a paragraph, except the spaces between the words.
This is the code I wrote:
paragraphInformation = paragraphInformation.replaceAll("[^a-zA-Z0-9\s]", "");
However, the compiler gave me an error message pointing to the s saying it's an illegal escape character. The program compiled OK before I added the \s to the end of the regular expression, but the problem with that was that the spaces between words in the paragraph were stripped out.
How can I fix this error?
Recommended answer
You need to double-escape the \ character: "[^a-zA-Z0-9\\s]"
Java will interpret \s as a Java String escape character, which is indeed an invalid Java escape. By writing \\, you escape the \ character, essentially sending a single \ character to the regex. This \ then becomes part of the regex escape character \s.
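As a rough illustration of the fix, here is a minimal sketch that applies the double-escaped pattern to a sample string. The class name StripNonAlphanumeric, the helper clean, and the sample text are made up for this example; only the replaceAll call with "[^a-zA-Z0-9\\s]" comes from the answer.

```java
public class StripNonAlphanumeric {

    // Hypothetical helper for illustration; the pattern is the one from the answer.
    static String clean(String paragraphInformation) {
        // In Java source, "\\s" is the two characters \ and s, which the
        // regex engine then reads as the whitespace class \s.
        return paragraphInformation.replaceAll("[^a-zA-Z0-9\\s]", "");
    }

    public static void main(String[] args) {
        // Sample input (an assumption, not from the question).
        String paragraphInformation = "Hello, world! It's 2 o'clock.";
        System.out.println(clean(paragraphInformation));
        // prints: Hello world Its 2 oclock
    }
}
```

Because \s matches any whitespace, tabs and line breaks are kept along with ordinary spaces. One caveat: as far as I know, since Java 15 "\s" is itself a legal string escape for a single space, so the original single-backslash line may compile on newer JDKs, but the regex then preserves only the space character rather than all whitespace.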