Using regular expressions in a shell script
Question
What is the correct way to parse a string using regular expressions in a Linux shell script? I wrote the following script to print my SO rep on the console using curl and sed (not solely because I'm rep-crazy - I'm trying to learn some shell scripting and regex before switching to Linux).
json=$(curl -s http://stackoverflow.com/users/flair/165297.json)
echo $json | sed 's/.*"reputation":"\([0-9,]\{1,\}\)".*/\1/' | sed s/,//
But somehow I feel that sed is not the proper tool to use here. I heard that grep is all about regex and explored it a bit. But apparently it prints the whole line whenever a match is found, whereas I am trying to extract a number from a single line of text. Here is a downsized version of the string that I'm working on (returned by curl).
{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":"\u003cspan title=\"1 silver badge\"\u003e\u003cspan class=\"badge2\"\u003e●\u003c/span\u003e\u003cspan class=\"badgecount\"\u003e1\u003c/span\u003e\u003c/span\u003e"}
I guess my questions are:
- What is the correct way to parse a string using regular expressions in a Linux shell script?
- Is sed the right thing to use here?
- Could this be done using grep?
- Is there any other command that's easier or more appropriate?
Recommended answer
The grep command will select the desired line(s) from many, but it will not directly manipulate the line. For that, you use sed in a pipeline:
someCommand | grep 'Amarghosh' | sed -e 's/foo/bar/g'
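As a rough illustration of that pipeline applied to the data in the question, the sketch below selects the line for one user with grep and then extracts the number with sed. The two-line input is made up for the example and stands in for the output of someCommand; the variable names input and rep are likewise only illustrative.

```shell
# Hypothetical input: two JSON lines standing in for the output of someCommand.
input='{"displayName":"SomeoneElse","reputation":"1,024"}
{"displayName":"Amarghosh","reputation":"2,737"}'

# grep selects the matching line; sed then rewrites it:
#   1st expression keeps only the captured digits-and-commas group,
#   2nd expression removes every comma.
rep=$(printf '%s\n' "$input" |
    grep 'Amarghosh' |
    sed -e 's/.*"reputation":"\([0-9,]*\)".*/\1/' -e 's/,//g')

echo "$rep"
```

Because grep has already discarded the other line, the sed expressions only ever see the record of interest.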
Alternatively, awk (or perl, if available) can be used. In my opinion, it's a far more powerful text processing tool than sed.
someCommand | awk '/Amarghosh/ { do something }'
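One possible way to fill in the "do something" placeholder for the reputation example is sketched here. It assumes the same JSON layout as in the question; the function name rep_with_awk and the sample string are hypothetical and exist only for this illustration.

```shell
# Shortened sample input modelled on the string in the question.
json='{"displayName":"Amarghosh","reputation":"2,737","badgeHtml":""}'

# rep_with_awk is a hypothetical helper name used only in this sketch.
rep_with_awk() {
    printf '%s\n' "$1" | awk '
        /Amarghosh/ {
            # match() sets RSTART and RLENGTH when the pattern is found
            if (match($0, /"reputation":"[0-9,]+"/)) {
                rep = substr($0, RSTART, RLENGTH)  # the matched fragment
                gsub(/[^0-9]/, "", rep)            # keep the digits only
                print rep
            }
        }'
}

rep_with_awk "$json"
```

Here a single awk process does both the line selection and the extraction, which is the kind of job that would otherwise take grep plus sed.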
For simple text manipulations, just stick with the grep/sed combo. When you need more complicated processing, move on up to awk or perl.
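On the question of whether grep alone could do the job: many grep implementations (GNU and BSD grep among them) accept a -o option that prints only the matched part of a line rather than the whole line. That option is not part of POSIX, so the sketch below assumes a grep that supports it. The function name rep_with_grep is hypothetical.

```shell
# Shortened sample input modelled on the string in the question.
json='{"displayName":"Amarghosh","reputation":"2,737"}'

# rep_with_grep is a hypothetical helper name used only in this sketch.
# Assumes grep supports -o (a common extension, not required by POSIX).
rep_with_grep() {
    # 1st grep: keep only the "reputation":"..." fragment.
    # 2nd grep: keep only the run of digits and commas inside it.
    # tr: delete the thousands separators.
    printf '%s\n' "$1" |
        grep -o '"reputation":"[0-9,]*"' |
        grep -o '[0-9][0-9,]*' |
        tr -d ','
}

rep_with_grep "$json"
```

This stays within simple tools, but it costs three processes, so it is not obviously better than a single sed invocation.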
My first thought is to just use:
echo '{"displayName":"Amarghosh","reputation":"2,737","badgeHtml"' |
    sed -e 's/.*tion":"//' -e 's/".*//' -e 's/,//g'
which keeps the number of sed processes to one (you can give multiple commands with -e).
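To see what each -e expression contributes, the same three substitutions can be run one at a time. This is only an illustrative walk-through: the variable names line, step1, step2 and step3 are not from the answer, and using three sed processes is deliberately less efficient than the single-process form.

```shell
line='{"displayName":"Amarghosh","reputation":"2,737","badgeHtml"'

# Step 1: delete everything up to and including  tion":"
step1=$(printf '%s\n' "$line" | sed -e 's/.*tion":"//')

# Step 2: delete from the first remaining double quote to the end of the line
step2=$(printf '%s\n' "$step1" | sed -e 's/".*//')

# Step 3: delete every comma (the g flag makes the substitution global)
step3=$(printf '%s\n' "$step2" | sed -e 's/,//g')

printf '%s\n' "$step1" "$step2" "$step3"
```

Note that the first expression relies on the text tion":" appearing only once in the line, which holds for the sample string but is an assumption about the input in general.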