Use regular expression to match ANY Chinese character in utf-8 encoding

Views: 620 Published: 2020/7/1 19:47:15 Tags: regex unicode flex-lexer non-english


Question

For example, to match a string consisting of m to n Chinese characters, I could use:

[single Chinese character regular expression]{m,n}

Is there a regular expression for a single Chinese character that matches any Chinese character that exists?

Recommended answer

The regex to match a Chinese (well, CJK) character is

\p{script=Han}

which can be written more simply as

\p{Han}

This assumes that your regex compiler meets requirement RL1.2 Properties from UTS#18 Unicode Regular Expressions. Perl and Java 7 both meet that spec, but many others do not.
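
For an engine that does not meet RL1.2, one possible workaround is to list code point ranges explicitly. The sketch below is not the answer's method. It is a fallback for Python's standard `re` module, which has historically not accepted `\p{...}`. The range list is an assumption that only approximates the Han script (it covers four well-known CJK ideograph blocks and omits later extensions), and the helper names `han_run_pattern` and `is_han_run` are hypothetical. The input is assumed to be a `str` already decoded from UTF-8.

```python
import re

# Assumption: these four blocks approximate the Han script.
# They are not the complete \p{Han} set (later extensions are omitted).
HAN_RANGES = (
    "\u3400-\u4DBF"          # CJK Unified Ideographs Extension A
    "\u4E00-\u9FFF"          # CJK Unified Ideographs
    "\uF900-\uFAFF"          # CJK Compatibility Ideographs
    "\U00020000-\U0002A6DF"  # CJK Unified Ideographs Extension B
)


def han_run_pattern(m, n):
    """Compile a pattern for a run of m to n characters from HAN_RANGES."""
    return re.compile("[" + HAN_RANGES + "]{" + str(m) + "," + str(n) + "}")


def is_han_run(text, m, n):
    """Return True if the whole string is m to n characters from HAN_RANGES."""
    return han_run_pattern(m, n).fullmatch(text) is not None


if __name__ == "__main__":
    print(is_han_run("汉字", 1, 3))
    print(is_han_run("汉字abc", 1, 3))
```

Where a third-party dependency is acceptable, the `regex` package for Python accepts `\p{Han}` directly, which avoids maintaining the range list by hand.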
