Matching text in quotes (newbie)

Question

I'm getting totally lost in shell programming, mainly because every site I look at offers a different tool for pattern matching. So my question is: which tool should I use to do simple pattern matching in a piped stream?

Context: I have a named.conf file, and I need all the zone names in a simple file for further processing. So I run ~$ cat named.local | grep zone and get totally lost from there. My output is a hundred or so lines of the form 'zone "domain.tld" {', and I need the text in double quotes.

Thanks for showing me a way.

J

Recommended answer

I think what you're looking for is sed. It's a stream editor that lets you do replacements on a line-by-line basis.

As you explain it, the command `cat named.local | grep zone' gives you output a little like this:

zone "domain1.tld" {
zone "domain2.tld" {
zone "domain3.tld" {
zone "domain4.tld" {

I'm guessing you want the output to look something like this, since you said you need the text in double quotes:

"domain1.tld"
"domain2.tld"
"domain3.tld"
"domain4.tld"

So, from each line, we just want the text between the double quotes (including the double quotes themselves).

I'm not sure how familiar you are with regular expressions, but they are an invaluable tool for anyone writing shell scripts. For example, the regular expression /.o.e/ matches any line containing a four-character sequence whose 2nd character is a lower-case o and whose 4th is e. That includes lines containing words like "zone" or "tone", or even the sentence "I am tone-deaf."
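To see this pattern in action, here is a minimal sketch using grep, which prints the input lines that match. The sample lines are made up for illustration.

```shell
# Feed four made-up sample lines to grep; only the lines containing
# "any character, o, any character, e" are kept.
matches=$(printf '%s\n' 'zone' 'tone' 'I am tone-deaf.' 'cat' | grep '.o.e')
printf '%s\n' "$matches"
```

The line 'cat' is dropped because it contains no such sequence; the other three are printed.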

The trick there was to use the . (dot) character to mean "any single character". There are a couple of other special characters, such as *, which means "repeat the previous character 0 or more times". Thus a regular expression like a* would match "a", "aaaaaaa", or an empty string: ""
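A quick way to check this is to ask grep for lines that consist of nothing but a*. This is only a sketch with made-up input; the -x option is needed because, without it, a* can match the empty string somewhere in any line at all, so every line would count as a match.

```shell
# Count the made-up sample lines that consist entirely of zero or more "a"s.
# -x requires the whole line to match; -c prints the number of matching lines.
count=$(printf '%s\n' 'a' 'aaaaaaa' '' 'b' | grep -c -x 'a*')
printf '%s\n' "$count"
```

Three of the four lines match: "a", "aaaaaaa", and the empty line.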

So you can match a quoted string, quotes included, using: /".*"/
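As a rough illustration, grep can print just the part of each line that matches this pattern. The sketch below assumes your grep supports the -o option (GNU and BSD grep do, but POSIX does not require it), and the zone lines are sample data modelled on the question.

```shell
# -o prints only the part of each line that matched the pattern.
# It is supported by GNU and BSD grep but is not required by POSIX.
quoted=$(printf '%s\n' 'zone "domain1.tld" {' 'zone "domain2.tld" {' | grep -o '".*"')
printf '%s\n' "$quoted"
```

Each output line is the quoted string with its quotes still attached.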

There's another thing you should know about sed (and judging by the comments, you already do!): it supports backreferences. Once you've told it how to recognize a piece of text, you can have it reuse that text as part of the replacement. For example, let's say that you wanted to turn this list:

Billy "The Kid" Smith
Jimmy "The Fish" Stuart
Chuck "The Man" Norris

into this list:

The Kid
The Fish
The Man

First, you'd look for the quoted string. We already saw that: it was /".*"/.

Next, we want to use what's inside the quotes. We can group it using parens: /"(.*)"/

If we wanted to replace the quoted text, quotes included, with an underscore, we'd do a substitution: s/"(.*)"/_/, and that would leave us with:

Billy _ Smith
Jimmy _ Stuart
Chuck _ Norris

But we have backreferences! They let us recall what was inside the parens, using the symbol \1. So if we now do s/"(.*)"/\1/ we'll get:

Billy The Kid Smith
Jimmy The Fish Stuart
Chuck The Man Norris

Because the quotes weren't inside the parens, they aren't part of the contents of \1!
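Both substitutions can be tried directly with sed. This is a sketch rather than the one true spelling: it writes the group as \( and \) because sed's default (basic) regular expression syntax expects that, and it uses single quotes so the shell leaves the expression alone. The variable names are just for illustration.

```shell
names='Billy "The Kid" Smith
Jimmy "The Fish" Stuart
Chuck "The Man" Norris'

# Replace the quoted part, quotes included, with an underscore.
# In sed's default (basic) syntax a group is written \( ... \).
underscored=$(printf '%s\n' "$names" | sed -e 's/"\(.*\)"/_/')

# Replace the quoted part with just the group's contents (\1),
# which drops the quotes but keeps the rest of the line.
unquoted=$(printf '%s\n' "$names" | sed -e 's/"\(.*\)"/\1/')

printf '%s\n' "$underscored" "$unquoted"
```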

To leave only the stuff inside the double quotes, we need to match the entire line. To do that we have ^ (which means "beginning of line") and $ (which means "end of line").

So now if we use s/^.*"(.*)".*$/\1/, we'll get:

The Kid
The Fish
The Man

Why? Let's read the regular expression s/^.*"(.*)".*$/\1/ from left to right:

  • s/ - Start a substitution.
  • ^ - Look for the beginning of the line. Start from there.
  • .* - Keep going, reading every character, until...
  • " - ... you reach a double quote.
  • ( - Start a group of characters we might want to recall later with a backreference.
  • .* - Keep going, reading every character, until...
  • ) - (pssst! close the group!)
  • " - ... you reach a double quote.
  • .* - Keep going, reading every character, until...
  • $ - The end of the line!
  • / - Use what's after this to replace what you matched.

In plain English: "Read the entire line, setting aside the text between the double quotes. Then replace the entire line with the text between the double quotes."
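The walkthrough above can be tried as follows. This is a sketch using sed's basic-syntax spelling of the group, \( and \). The second command uses a hypothetical line with two quoted strings to show one consequence of .* taking as much as it can: the group ends up holding the last quoted string on the line, not the first.

```shell
names='Billy "The Kid" Smith
Jimmy "The Fish" Stuart
Chuck "The Man" Norris'

# Whole-line match: each line is replaced by the text between its quotes.
nicknames=$(printf '%s\n' "$names" | sed -e 's/^.*"\(.*\)".*$/\1/')
printf '%s\n' "$nicknames"

# Hypothetical line with two quoted strings: the leading .* takes as much
# as it can, so the group ends up holding the last quoted string.
last=$(printf '%s\n' 'a "x" b "y" c' | sed -e 's/^.*"\(.*\)".*$/\1/')
printf '%s\n' "$last"
```

For lines with a single quoted string, such as the zone lines in the question, this difference does not matter.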

You can even add double quotes around the replacement text, s/^.*"(.*)".*$/"\1"/, so we'll get:

"The Kid"
"The Fish"
"The Man"

And sed can use that to replace each line with the content from within the quotes:

sed -e "s/^.*\"\(.*\)\".*$/\"\1\"/"

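(This is the same expression, shell-escaped: each double quote gets a backslash because the whole expression sits inside double quotes, and the parens are written \( and \) because that is how sed's default basic regular expressions mark a group.)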
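If the escaping is hard to read, one option is to wrap the expression in single quotes instead, so the double quotes need no backslashes. The sketch below runs both spellings on one sample line to suggest they behave the same; the variable names are only for illustration.

```shell
line='Chuck "The Man" Norris'

# Expression inside double quotes: every literal " must be escaped for the shell.
double_quoted=$(printf '%s\n' "$line" | sed -e "s/^.*\"\(.*\)\".*$/\"\1\"/")

# Same expression inside single quotes: the shell passes it through untouched.
single_quoted=$(printf '%s\n' "$line" | sed -e 's/^.*"\(.*\)".*$/"\1"/')

printf '%s\n' "$double_quoted" "$single_quoted"
```

The \( and \) stay in both versions, since they are part of sed's syntax rather than shell escaping.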

So the whole command would be something like:

cat named.local | grep zone | sed -e "s/^.*\"\(.*\)\".*$/\"\1\"/"
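As a possible variation, sed can do the filtering itself, which makes cat and grep unnecessary: -n turns off the default printing, and the p flag prints only the lines where the substitution succeeded. The sketch below is self-contained, so it writes a small made-up config to a temporary file (created with mktemp, which is assumed to be available) instead of reading a real named.local. The pattern also assumes that zone lines begin with optional whitespace followed by the word zone.

```shell
# Create a throwaway file with made-up contents resembling a named.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
options {
    directory "/var/named";
};
zone "domain1.tld" {
    type master;
};
zone "domain2.tld" {
    type master;
};
EOF

# -n suppresses normal output; the trailing p prints only lines where the
# substitution matched, i.e. lines starting with "zone".
zones=$(sed -n 's/^[[:space:]]*zone.*"\(.*\)".*$/"\1"/p' "$conf")
printf '%s\n' "$zones"

rm -f "$conf"
```

With a real file, the same sed command would take named.local as its argument in place of "$conf".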
