\s doesn't actually capture all whitespace characters


Problem description

In my Java 8 app, I am scanning for whitespace in text passed in. But \s in my regular expression doesn't capture all whitespace characters. The one I've found so far in my testing that it doesn't capture is the non-breaking space (U+00A0). This was the regular expression that ran into the issue:

Pattern p = Pattern.compile("\\s");

To solve this, I added \h to my Regular Expression:

Pattern p = Pattern.compile("[\\s\\h]");
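
The difference between the two patterns can be checked with a small test. This is a minimal sketch, assuming Java 8 or later; the class name NbspDemo and the helper found are made up for illustration and are not from the original post.

```java
import java.util.regex.Pattern;

class NbspDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // U+00A0 (no-break space), built from its code point to keep the source ASCII-only.
        String nbsp = String.valueOf((char) 0x00A0);

        System.out.println(found("\\s", nbsp));      // default \s is ASCII-only: false
        System.out.println(found("\\h", nbsp));      // \h is horizontal whitespace: true
        System.out.println(found("[\\s\\h]", nbsp)); // combined class: true
    }
}
```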

Now, are there any other whitespace characters I need to be aware of that won't be captured by [\s\h]?

Recommended answer

According to the Pattern class documentation, the characters that match \s are [ \t\n\x0B\f\r]: space, tab, line feed, vertical tab, form feed and carriage return.

However, Unicode defines many more space characters. Examples include:

  • \u2002: En space
  • \u2003: Em space
  • \u2009: Thin space
  • \u202F: Narrow no-break space
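
One way to cover these characters without listing them by hand is to compile the pattern with Pattern.UNICODE_CHARACTER_CLASS, which makes \s follow the Unicode White_Space property. The sketch below assumes Java 8 or later; the class name UnicodeWhitespaceDemo and the helper isMatch are illustrative only, and this is one possible approach rather than the answer's own recommendation.

```java
import java.util.regex.Pattern;

class UnicodeWhitespaceDemo {
    // Default \s: only the six ASCII whitespace characters.
    static final Pattern ASCII_WS = Pattern.compile("\\s");
    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space property.
    static final Pattern UNICODE_WS =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: true if the single code point matches the pattern.
    static boolean isMatch(Pattern p, int codePoint) {
        return p.matcher(new String(Character.toChars(codePoint))).matches();
    }

    public static void main(String[] args) {
        // No-break space, en space, em space, thin space, narrow no-break space.
        int[] samples = {0x00A0, 0x2002, 0x2003, 0x2009, 0x202F};
        for (int cp : samples) {
            System.out.printf("U+%04X default=%b unicode=%b%n",
                    cp, isMatch(ASCII_WS, cp), isMatch(UNICODE_WS, cp));
        }
    }
}
```

Note that \h already matches the horizontal spaces listed above. What [\s\h] leaves out are the vertical whitespace characters that the Pattern documentation lists under \v, such as U+0085, U+2028 and U+2029.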
