What is a regular expression for control characters?


Question

I'm trying to match a control character written in the form \^c, where c is any character that is valid in a control-character escape. I have this regular expression, but it doesn't work: \\[^][@-z]

I think the problem is that the caret character (^) has a special meaning to the regular expression engine.

Recommended answer

To match an ASCII text string of the form ^X, the pattern \^. is all you need. To match a text string of the form \^X, use the pattern \\\^. instead. You may wish to constrain that dot to [A-Z?@_\[\]^\\], giving \\\^[A-Z?@_\[\]^\\]. The bracketed character class is easier to read as [?\x40-\x5F], hence \\\^[?\x40-\x5F]: a literal BACKSLASH, followed by a literal CIRCUMFLEX, followed by a character that maps to one of the valid control characters.
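As a quick sanity check on the claim that the two bracketed classes are interchangeable, the sketch below compares them over every 7-bit ASCII character. The class and method names are made up for illustration, and the backslashes are doubled only because the patterns are written as Java string literals.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
final class ClassEquivalenceDemo {
    // Regex text: [A-Z?@_\[\]^\\]  (explicit list of allowed characters)
    static final Pattern EXPLICIT = Pattern.compile("[A-Z?@_\\[\\]^\\\\]");
    // Regex text: [?\x40-\x5F]     (same set written as a hex range plus '?')
    static final Pattern RANGE = Pattern.compile("[?\\x40-\\x5F]");

    // Returns true if both classes accept exactly the same ASCII characters.
    static boolean sameOverAscii() {
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (EXPLICIT.matcher(s).matches() != RANGE.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameOverAscii());
    }
}
```

The range form works because @, A through Z, [, \, ], ^ and _ occupy the contiguous code points 0x40 through 0x5F, which leaves only ? to be listed separately.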

Note that this is the pattern as you would see it printed out, or as you would read it from a file; it is what the regex compiler needs to receive. If you write it as a string literal, you must of course double each of those backslashes: "\\\\\\^[?\\x40-\\x5F]". Yes, it looks insane, but that is because Java does not support regexes directly the way Groovy and Scala (or Perl and Ruby) do. Regex work is always easier without the extra bbaacckksslllllaasshheesssssess. :)
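For illustration, here is a minimal Java sketch of that doubled-backslash literal in use. The class name, method name, and sample input are assumptions made for this example; only the pattern itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
final class ControlEscapeDemo {
    // Regex as the engine sees it: \\\^[?\x40-\x5F]
    // Every backslash is doubled because this is a Java string literal.
    static final Pattern CONTROL_ESCAPE = Pattern.compile("\\\\\\^[?\\x40-\\x5F]");

    // Returns every \^X escape found in the input, in order of appearance.
    static List<String> findEscapes(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = CONTROL_ESCAPE.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // This Java literal is the text: a\^Cb\^[c\^?d\^1
        System.out.println(findEscapes("a\\^Cb\\^[c\\^?d\\^1"));
    }
}
```

With this input the sketch finds \^C, \^[ and \^? but skips \^1, because the digit 1 falls outside the range 0x40 through 0x5F.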

If you had real control characters instead of indirect representations of them, you would use \pC for all literal code points with the property GC=Other, or \p{Cc} for just GC=Control.
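The difference between the two properties can be sketched in Java as follows. The class and method names are hypothetical, and the braced spelling \p{C} is used here in place of \pC; the sample characters (a tab, and U+200B, a format character in category Cf) were chosen for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
final class RealControlCharDemo {
    // GC=Control only (regex text: \p{Cc})
    static final Pattern CONTROL = Pattern.compile("\\p{Cc}");
    // GC=Other, the broader category that contains Cc, Cf, and others (regex text: \p{C})
    static final Pattern OTHER = Pattern.compile("\\p{C}");

    // True if the string contains at least one GC=Control code point.
    static boolean containsControl(String s) {
        return CONTROL.matcher(s).find();
    }

    // True if the string contains at least one GC=Other code point.
    static boolean containsOther(String s) {
        return OTHER.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsControl("a\tb"));      // tab is a control character
        System.out.println(containsControl("a\u200Bb"));  // U+200B is Cf, not Cc
        System.out.println(containsOther("a\u200Bb"));    // but Cf belongs to GC=Other
    }
}
```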
