Bash regex if statement


Problem description

What did I do wrong here?

Trying to match any string that contains spaces, lowercase, uppercase, or numbers. Special characters would be nice too, but I think that requires escaping certain characters.

TEST="THIS is a TEST title with some numbers 12345 and special char *&^%$#"

if [[ "$TEST" =~ [^a-zA-Z0-9\ ] ]]; then BLAH; fi

This obviously only tests for upper, lower, numbers, and spaces. Doesn't work though.

* UPDATE *

I guess I should have been more specific. Here is the actual real line of code.

if [[ "$TITLE" =~ [^a-zA-Z0-9\ ] ]]; then RETURN="FAIL" && ERROR="ERROR: Title can only contain upper and lowercase letters, numbers, and spaces!"; fi

* UPDATE *

./anm.sh: line 265: syntax error in conditional expression
./anm.sh: line 265: syntax error near `&*#]'
./anm.sh: line 265: `  if [[ ! "$TITLE" =~ [a-zA-Z0-9 $%^\&*#] ]]; then RETURN="FAIL" && ERROR="ERROR: Title can only contain upper and lowercase letters, numbers, and spaces!"; return; fi'

Solution

There are a couple of important things to know about bash's [[ ]] construction. The first:

Word splitting and pathname expansion are not performed on the words between the [[ and ]]; tilde expansion, parameter and variable expansion, arithmetic expansion, command substitution, process substitution, and quote removal are performed.

The second thing:

An additional binary operator, ‘=~’, is available,... the string to the right of the operator is considered an extended regular expression and matched accordingly... Any part of the pattern may be quoted to force it to be matched as a string.

Consequently, $v on either side of the =~ will be expanded to the value of that variable, but the result will not be word-split or pathname-expanded. In other words, it's perfectly safe to leave variable expansions unquoted on the left-hand side, but you need to know that variable expansions will happen on the right-hand side.
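
To make the point about expansion concrete, here is a minimal sketch. The values of x and re are made up for illustration; the left-hand variable contains spaces and a glob character but needs no quotes, and the right-hand variable is expanded and then used as the regex.

```shell
#!/usr/bin/env bash
# Hypothetical sample value: contains spaces and a glob character.
x='two words *'

# Unquoted $x on the left: expanded, but not word-split or glob-expanded.
if [[ $x =~ ^two ]]; then
  left_result="match"
else
  left_result="no match"
fi

# Unquoted $re on the right: expanded first, then interpreted as a regex.
re='wor.s'
if [[ $x =~ $re ]]; then
  right_result="match"
else
  right_result="no match"
fi

echo "left: $left_result, right: $right_result"
```

Both tests should report a match, since the value begins with "two" and contains "words".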

So if you write [[ $x =~ [$0-9a-zA-Z] ]], the $0 inside the regex on the right will be expanded before the regex is interpreted, which will probably cause the regex to fail to compile (unless the expansion of $0 ends with a digit or a punctuation character whose ASCII value is less than that of a digit). If you write [[ $x =~ "[$0-9a-zA-Z]" ]], the right-hand side will be treated as an ordinary string rather than a regex (and $0 will still be expanded). What you really want in this case is [[ $x =~ [\$0-9a-zA-Z] ]].
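
A small sketch of the same effect, using a made-up variable named range in place of $0 so that the result is predictable: the unescaped expansion becomes part of the bracket expression, while an escaped \$ stays a literal dollar sign.

```shell
#!/usr/bin/env bash
# Hypothetical variable holding part of a bracket expression.
range='a-c'

# $range is expanded first, so the regex that gets compiled is ^[a-c]$.
if [[ b =~ ^[$range]$ ]]; then expanded="match"; else expanded="no match"; fi

# The same pattern does not match a letter outside the expanded range.
if [[ z =~ ^[$range]$ ]]; then outside="match"; else outside="no match"; fi

# Escaping the dollar sign keeps it literal, so "$" is a member of the class.
s='$'
if [[ $s =~ ^[\$0-9]$ ]]; then escaped="match"; else escaped="no match"; fi

echo "expanded: $expanded, outside: $outside, escaped: $escaped"
```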

Similarly, the expression between the [[ and ]] is split into words before the regex is interpreted. So spaces in the regex need to be escaped or quoted. If you wanted to match letters, digits or spaces you could use: [[ $x =~ [0-9a-zA-Z\ ] ]]. Other characters similarly need to be escaped, like #, which would start a comment if not quoted. Of course, you can put the pattern into a variable:

pat="[0-9a-zA-Z ]"
if [[ $x =~ $pat ]]; then ...

For regexes containing many characters that would need to be escaped or quoted to pass through bash's lexer, many people prefer this style. But beware: in this case you must not quote the variable expansion, because a quoted pattern is matched as a literal string rather than as a regex:

# This doesn't work:
if [[ $x =~ "$pat" ]]; then ...
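
The three variants can be compared side by side. This is only a sketch with a made-up input value; the anchors ^ and $ and the + quantifier are added here so that the whole string has to match.

```shell
#!/usr/bin/env bash
# Hypothetical input value.
x='abc 123'

# Inline pattern: the space is escaped so the lexer keeps the regex as one word.
if [[ $x =~ ^[0-9a-zA-Z\ ]+$ ]]; then inline="match"; else inline="no match"; fi

# Pattern in a variable: the quotes of the assignment protect the space.
pat='^[0-9a-zA-Z ]+$'
if [[ $x =~ $pat ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted expansion: the pattern is compared as a literal string, so it only
# matches input that contains the text of the pattern itself.
if [[ $x =~ "$pat" ]]; then quoted="match"; else quoted="no match"; fi

echo "inline: $inline, unquoted: $unquoted, quoted: $quoted"
```

On bash 3.2 or later with default compatibility settings, the first two tests match and the quoted one does not.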

Finally, I think what you are trying to do is verify that the variable only contains valid characters. The easiest way to do this check is to make sure that it does not contain an invalid character. In other words, an expression like this:

valid='0-9a-zA-Z $%&#' # add almost whatever else you want to allow to the list
if [[ ! $x =~ [^$valid] ]]; then ...

! negates the test, turning it into a "does not match" operator, and a [^...] regex character class means "any character other than ...".
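
Wrapped in a function, the check could look roughly like this. The function name only_valid_chars is hypothetical, and the sample strings are made up apart from the TEST value taken from the question.

```shell
#!/usr/bin/env bash
# Characters allowed in a title (the list from the answer).
valid='0-9a-zA-Z $%&#'

# Hypothetical helper: succeeds when the argument contains no character
# outside $valid. Note that an empty string also passes this check.
only_valid_chars() {
  [[ ! $1 =~ [^$valid] ]]
}

# The string from the question contains * and ^, which are not in $valid.
TEST='THIS is a TEST title with some numbers 12345 and special char *&^%$#'

for title in 'Title 100% #1' "$TEST"; do
  if only_valid_chars "$title"; then
    echo "ok:   $title"
  else
    echo "FAIL: $title"
  fi
done
```

Testing for the presence of one invalid character avoids having to anchor the regex and repeat the class over the whole string.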

The combination of parameter expansion and regex operators can make bash regular expression syntax "almost readable", but there are still some gotchas. (Aren't there always?) One is that you cannot put ] into $valid, even if $valid were quoted, except at the very beginning. (That's a POSIX regex rule: if you want to include ] in a character class, it needs to go at the beginning. - can go at the beginning or the end, so if you need both ] and -, you need to start with ] and end with -, leading to the regex "I know what I'm doing" emoticon: [][-].)
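
A short sketch of the bracket rule, with made-up variable names and sample strings: ] goes first in the class, - goes last, and the same ordering works inside a negated class built from $valid.

```shell
#!/usr/bin/env bash
# Class containing ], [ and -, anchored so it must match a single character.
pat='^[][-]$'

matched=""
for c in ']' '[' '-' 'a'; do
  if [[ $c =~ $pat ]]; then
    matched+="$c"
    echo "'$c' matches"
  else
    echo "'$c' does not match"
  fi
done

# Allowed-character list with ] first and - last (hypothetical example).
valid=']0-9a-z-'

# No character outside the list, so the negated test succeeds.
if [[ ! 'a]b-1' =~ [^$valid] ]]; then allowed="valid"; else allowed="invalid"; fi

# "[" is not in the list, so the negated test fails.
if [[ ! 'a[b' =~ [^$valid] ]]; then rejected="valid"; else rejected="invalid"; fi

echo "a]b-1 is $allowed, a[b is $rejected"
```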
