Can't figure out how to implement REGEX with AppleScript
Question
I wrote a regex to find and output the first run of digits in a string:
find: ^[^\d]*(\d+).*
replace: $1
The problem is that the only way I know to actually use this in AppleScript is to call a shell script and use sed, and I can't figure out how to express my regex that way. I've tried for hours without any luck. This is as close as I can get, but it returns ALL the digits in the string rather than only the first group of digits:
set num to do shell script "sed 's/[^0-9]*//g' <<< " & quoted form of input
What I would really like is a way to use AppleScript to just WORK with regex and match replacement references ($1, $2, etc.).
Recommended answer
Note that sed does not support PCRE shorthand character classes like \d, nor does it support regex escapes inside bracket expressions.
Also, since you use the POSIX BRE flavor of sed (no -r or -E option is used), you need \(...\), not (...), to define a capturing group.
Also, + matches a literal + symbol in a POSIX BRE pattern. To use it as a quantifier you need to escape it (\+), but to play it safe, you can just expand a+ to aa*.
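The difference is easy to check from a terminal. The following is a small sketch with made-up sample strings; it only illustrates how an unescaped + behaves in a BRE pattern and how the aa* expansion replaces it.

```shell
# In BRE, an unescaped + is an ordinary character: it matches only a literal "+".
printf '%s\n' 'a+b'  | sed 's/a+/X/'    # prints: Xb
printf '%s\n' 'aaab' | sed 's/a+/X/'    # prints: aaab (there is no literal "a+" to match)

# Portable "one or more a" in BRE: write the atom once, then the atom with a star.
printf '%s\n' 'aaab' | sed 's/aa*/X/'   # prints: Xb
```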
The replacement backreference syntax in sed is \ + number (for example, \1).
Use this POSIX BRE solution:
sed 's/^[^0-9]*\([0-9][0-9]*\).*/\1/'
or, if you use the -E or -r option, a POSIX ERE solution:
sed -E 's/^[^0-9]*([0-9]+).*/\1/'
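As a quick check, the ERE command can be wrapped in a small shell function and run against sample input. This is only a sketch: the name first_number and the sample strings are made up for illustration, and the -n option with the p flag is an addition (not part of the answer) so that nothing is printed when a line contains no digits.

```shell
#!/bin/sh
# first_number (hypothetical helper): print the first run of digits found in "$1".
first_number() {
    # -n suppresses automatic printing; the trailing p prints the line
    # only when the substitution actually matched.
    printf '%s\n' "$1" | sed -n -E 's/^[^0-9]*([0-9]+).*/\1/p'
}

first_number 'order 123, item 456'   # prints: 123
first_number 'no digits here'        # prints nothing
```

Without -n and p, a line that contains no digits is passed through unchanged, because the pattern does not match. When the sed command is embedded in an AppleScript string for do shell script, the backslash has to be doubled (\\1), since AppleScript treats \ as an escape character inside string literals.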
Details
^ - start of the string
[^0-9]* - zero or more characters other than digits (you may also use [^[:digit:]]*)
\( - start of capturing group #1, referred to with the \1 placeholder in the replacement pattern (in ERE, ( starts a capturing group)
[0-9][0-9]* = [0-9]\+ (BRE, where \+ is a GNU extension) = [0-9]+ (ERE) - one or more digits
\) - end of the capturing group (in POSIX ERE, use ) instead)
.* - the rest of the line.
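The same building blocks extend to more than one group, which covers the $1, $2 style of replacement asked about in the question. The sketch below is an illustration only: the function name swap_first_two_numbers and the sample input are hypothetical, and the -n option with the p flag is added so that lines with fewer than two numbers print nothing.

```shell
#!/bin/sh
# swap_first_two_numbers (hypothetical helper): capture the first two runs of
# digits in "$1" and print them in reverse order, using \1 and \2.
swap_first_two_numbers() {
    printf '%s\n' "$1" |
        sed -n -E 's/^[^0-9]*([0-9]+)[^0-9]+([0-9]+).*/\2 \1/p'
}

swap_first_two_numbers 'abc 123 def 456 ghi'   # prints: 456 123
```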