Extract substring using regex in shell script
Question
The strings could be of form:
- com.company.$(PRODUCT_NAME:rfc1034identifier)
- $(PRODUCT_BUNDLE_IDENTIFIER)
- com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)
I need help writing a regex that extracts all the strings inside $(..).
I created a regex like ([(])\w+([)]), but when I try to run it in a shell script, it gives me an unmatched-parenthesis error.
This is what I executed:
echo "com.io.$(sdfsdfdsf)"|grep -P '([(])\w+([)])' -o
I need to get all matching substrings.
Recommended answer
Your question specifies "shell", but not "bash". So I'll start with a common shell-based tool (awk) rather than assuming you can use any particular set of non-POSIX built-ins.
$ cat inp.txt
com.company.$(PRODUCT_NAME:rfc1034identifier)
$(PRODUCT_BUNDLE_IDENTIFIER)
com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)
$ awk -F'[()]' '{for(i=2;i<=NF;i+=2){print $i}}' inp.txt
PRODUCT_NAME:rfc1034identifier
PRODUCT_BUNDLE_IDENTIFIER
PRODUCT_NAME:rfc1034identifier
someRandomVariable
This awk one-liner defines a field separator that consists of opening or closing brackets. With such a field separator, every even-numbered field will be the content you're looking for, assuming all lines of input are correctly formatted and there are no parentheses embedded inside other parentheses.
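If the input may contain parentheses that are not part of a $(..) expression, a stricter variant can match the leading dollar sign explicitly. The following is a minimal sketch, not part of the original answer: the function name extract_vars is hypothetical, and it assumes that nested parentheses do not occur. It uses only match(), substr(), RSTART and RLENGTH, which POSIX awk provides.

```shell
#!/bin/sh
# Hypothetical helper: print the text inside every $(...) found on stdin.
# Assumes there are no nested parentheses inside a $(...) expression.
extract_vars() {
    awk '{
        s = $0
        # Repeatedly find the next "$(" ... ")" in the remainder of the line.
        while (match(s, /\$\([^)]*\)/)) {
            # Skip the leading "$(" (2 chars) and drop the trailing ")".
            print substr(s, RSTART + 2, RLENGTH - 3)
            s = substr(s, RSTART + RLENGTH)
        }
    }'
}

# Single quotes stop the shell from running $(...) as a command substitution.
printf '%s\n' 'com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)' | extract_vars
```

Unlike the field-separator approach, this ignores plain parentheses such as those in foo(bar), at the cost of a slightly longer awk program.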
If you did want to do this in POSIX shell alone, the following would be an option:
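#!/bin/sh
# Read each line verbatim (no whitespace trimming, no backslash processing).
while IFS= read -r line; do
    # Keep going while the line still contains an opening parenthesis.
    while expr "$line" : '.*(' >/dev/null; do
        line="${line#*\(}"     # drop everything up to and including the first "("
        echo "${line%%\)*}"    # print the text before the next ")"
    done
done < inp.txt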
This steps through each line of input, slicing it up at the parentheses and printing each slice. Note that this uses expr, which is most likely an external binary, but is at least included in POSIX.1.
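If avoiding the external expr call matters, the same slicing can be driven by a case pattern, which is a shell built-in construct. This is a sketch under the same assumptions as above (well-formed input, no nested parentheses); the function name extract_parens is hypothetical and not from the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print the text inside every (...) pair read from stdin,
# using only POSIX shell built-ins (case and parameter expansion).
extract_parens() {
    while IFS= read -r line; do
        while :; do
            # Stop once no "(" followed by a later ")" remains on the line.
            case $line in
                *"("*")"*) ;;
                *) break ;;
            esac
            line=${line#*\(}                 # drop through the first "("
            printf '%s\n' "${line%%\)*}"     # print up to the next ")"
            line=${line#*\)}                 # drop through that ")"
        done
    done
}

# Single quotes stop the shell from running $(...) as a command substitution.
printf '%s\n' 'com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)' | extract_parens
```

Because the case pattern requires a closing parenthesis after the opening one, a line with an unclosed "(" prints nothing instead of the remainder of the line.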