Java regular expression to remove all non alphanumeric characters EXCEPT spaces

Question

I'm trying to write a regular expression in Java which removes all non-alphanumeric characters from a paragraph, except the spaces between the words.

This is the code I wrote:

paragraphInformation = paragraphInformation.replaceAll("[^a-zA-Z0-9\s]", "");

However, the compiler gave me an error message pointing to the s saying it's an illegal escape character. The program compiled OK before I added the \s to the end of the regular expression, but the problem with that was that the spaces between words in the paragraph were stripped out.

How can I fix this error?

Recommended answer

You need to double-escape the \ character: "[^a-zA-Z0-9\\s]"

Java first interprets \s as a string-literal escape sequence, and the compiler rejects it as an invalid one. By writing \\, you escape the \ character itself, so the string that reaches the regex engine contains a single \. That \ then forms the regex escape \s, which matches whitespace.
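
As a minimal sketch of the fix in context, the snippet below applies the double-escaped pattern to a sample string. The class name `StripNonAlphanumeric`, the helper `strip`, and the sample text are hypothetical and only serve the illustration; the pattern and the variable name `paragraphInformation` come from the question.

```java
class StripNonAlphanumeric {
    // Hypothetical helper: removes everything except ASCII letters,
    // digits, and whitespace.
    static String strip(String text) {
        // In Java source, "\\s" is the two-character sequence \s,
        // which the regex engine reads as "any whitespace character".
        return text.replaceAll("[^a-zA-Z0-9\\s]", "");
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration only.
        String paragraphInformation = "Hello, world! It's 2024.";
        paragraphInformation = strip(paragraphInformation);
        System.out.println(paragraphInformation); // prints: Hello world Its 2024
    }
}
```

Because \s matches tabs and line breaks as well as spaces, those survive too. If only the space character should be kept, a literal space inside the character class, "[^a-zA-Z0-9 ]", should do the job without any escaping. Note also that Java 15 and later accept \s in a string literal as an escape for a single space, so on those versions the original line compiles but keeps only literal spaces.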
