The regular expression \p{Punct} misses Unicode punctuation in Java

Views: 117 Published: 2019/1/7 16:38:19 Tags: java regex unicode


Problem description

I wrote a small test to demonstrate the problem:

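import static org.junit.Assert.assertTrue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Test;

@Test
public void missingPunctuationRegex() {
    Pattern punct = Pattern.compile("[\\p{Punct}]");

    // ASCII apostrophe (U+0027): this assertion passes
    Matcher m = punct.matcher("'");
    assertTrue("ascii punctuation", m.find());

    // LEFT SINGLE QUOTATION MARK (U+2018): this assertion fails
    m = punct.matcher("‘");
    assertTrue("unicode punctuation", m.find());
}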

The first assertion passes and the second one fails. You may have to squint to see it, but the second character is 'LEFT SINGLE QUOTATION MARK' (U+2018), which, as far as I can tell, should be covered as punctuation.
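One way to check the claim that U+2018 counts as punctuation is to ask Java for the character's Unicode general category. The sketch below is only an illustration: the class name `QuoteCategoryCheck` and its helper methods are hypothetical, while `Character.getType` and `Character.INITIAL_QUOTE_PUNCTUATION` are standard JDK members. It suggests the character is punctuation in Unicode terms (category Pi) but lies outside US-ASCII, which is the range the POSIX-style `\p{Punct}` class is documented to cover by default.

```java
public class QuoteCategoryCheck {
    // Hypothetical helper: true when the code point's general category is
    // "initial quote punctuation" (Pi), one of the Unicode punctuation categories.
    static boolean isInitialQuote(int codePoint) {
        return Character.getType(codePoint) == Character.INITIAL_QUOTE_PUNCTUATION;
    }

    // Hypothetical helper: true only for code points in the US-ASCII range.
    static boolean isAscii(int codePoint) {
        return codePoint >= 0 && codePoint < 0x80;
    }

    public static void main(String[] args) {
        int leftQuote = 0x2018; // LEFT SINGLE QUOTATION MARK
        System.out.println("initial quote punctuation: " + isInitialQuote(leftQuote));
        System.out.println("ASCII: " + isAscii(leftQuote));
    }
}
```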

How can I match ALL punctuation characters in a Java regular expression?
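A possible approach, offered as a sketch rather than as the accepted answer (this page does not include one): by default `\p{Punct}` in `java.util.regex.Pattern` is the POSIX class and covers only ASCII punctuation, whereas `\p{IsPunctuation}` (or the general category `\p{P}`) matches Unicode punctuation. Alternatively, the `Pattern.UNICODE_CHARACTER_CLASS` flag makes `\p{Punct}` follow the Unicode definition. The class name `UnicodePunctuation` and its fields are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

public class UnicodePunctuation {
    // Default POSIX class: ASCII punctuation only.
    static final Pattern POSIX = Pattern.compile("\\p{Punct}");

    // Unicode binary property: characters in the punctuation (P*) categories.
    static final Pattern UNICODE = Pattern.compile("\\p{IsPunctuation}");

    // Same source pattern as POSIX, but with Unicode-aware predefined classes.
    static final Pattern POSIX_WITH_FLAG =
            Pattern.compile("\\p{Punct}", Pattern.UNICODE_CHARACTER_CLASS);

    // Union of both definitions, for callers who also want ASCII symbols
    // such as '$' that Unicode classifies as symbols rather than punctuation.
    static final Pattern EITHER = Pattern.compile("[\\p{Punct}\\p{IsPunctuation}]");

    // Returns true when the pattern matches anywhere in the input.
    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String leftQuote = "\u2018"; // LEFT SINGLE QUOTATION MARK
        System.out.println("POSIX:           " + matches(POSIX, leftQuote));
        System.out.println("UNICODE:         " + matches(UNICODE, leftQuote));
        System.out.println("POSIX_WITH_FLAG: " + matches(POSIX_WITH_FLAG, leftQuote));
        System.out.println("EITHER on '$':   " + matches(EITHER, "$"));
    }
}
```

Note that the two definitions are not nested: ASCII characters such as `$`, `+`, and `~` belong to the POSIX class but are Unicode symbols, so whether to use the union depends on what "all punctuation" should mean for the application.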
