sed recipe: how to do stuff between two patterns that can be either on one line or on two lines?

Question

Let's say we want to do some substitutions only between some patterns, let them be <a> and </a> for clarity... (all right, all right, they're start and end!.. Jeez!)

So I know what to do if start and end always occur on the same line: just design a proper regex.

I also know what to do if they're guaranteed to be on different lines and I don't care about anything in the line containing end and I'm also OK with applying all the commands in the line containing start before start: just specify the address range as /start/,/end/.
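
As a minimal sketch of that address-range case (the sample input is made up for illustration), note how the substitution also touches the lines that contain start and end themselves:

```shell
# Hypothetical sample input: only the region from "start" to "end" should change.
input='foo before
start foo
foo inside
end foo
foo after'

# /start/,/end/ restricts the s command to the lines from the first line
# matching "start" through the next line matching "end", both inclusive.
result=$(printf '%s\n' "$input" | sed '/start/,/end/ s/foo/bar/')
printf '%s\n' "$result"
```

The whole start line and the whole end line are processed, which is exactly the limitation described above.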

This, however, doesn't sound very useful. What if I need to do a smarter job, for instance, introduce changes inside a {...} block?

One thing I can think of is breaking the input on { and } before processing and putting it back together afterwards:

sed 's/{\|}/\n&\n/g' input | sed 'main stuff' | sed ':a $!{N;ba}; s/\n\(}\|{\)\n/\1/g'
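
A worked sketch of this split/process/rejoin idea, assuming GNU sed (for `\|`, `\n` in the replacement and `\U`). The split step is written as `\n&\n` so the braces survive on lines of their own, and the middle stage is only a hypothetical stand-in for 'main stuff':

```shell
# Hypothetical one-line input with a single {...} block.
input='keep this { this and that } that'

# Stage 1: put every brace on a line of its own.
# Stage 2: stand-in for the main work; uppercase this/that between the brace lines.
# Stage 3: slurp everything and glue the brace lines back into place.
result=$(printf '%s\n' "$input" |
  sed 's/{\|}/\n&\n/g' |
  sed '/^{$/,/^}$/ s/this\|that/\U&/g' |
  sed ':a;$!{N;ba};s/\n\(}\|{\)\n/\1/g')
printf '%s\n' "$result"
```

Only the text between the braces is changed; the words outside the block stay lowercase.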

Another option is the other way around:

cat input | tr '\n' '#' | sed 'whatever; s/#/\n/g'

Both of these are ugly, mainly because the operations are not confined within a single command. The second one is even worse because one has to use some character or substring as a "newline holder" assuming it isn't present in the original text.
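
To make the "newline holder" caveat concrete, here is a small sketch, assuming GNU sed and assuming `#` as the placeholder. The guard and the uppercase substitution are illustrative choices, not part of the original command:

```shell
# Hypothetical multi-line input with one {...} block.
input='this { this needs
changing to
that } that'

# The placeholder must not already occur in the input, so check first.
if printf '%s' "$input" | grep -q '#'; then
  echo 'placeholder "#" occurs in the input; pick another one' >&2
else
  # Join the lines with "#", uppercase each {...} block, then restore newlines.
  result=$(printf '%s\n' "$input" | tr '\n' '#' | sed 's/{[^}]*}/\U&/g; s/#/\n/g')
  printf '%s\n' "$result"
fi
```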

So the question is: are there better ways or can the above-mentioned ones be optimized? This is quite a regular task from what I read in recent SO questions, so I'd like to choose the best practice once and for all.

P.S. I'm mostly interested in pure sed solutions: can the job be done with one invocation of sed and nothing else? Please no awk, Perl, etc.: this is more of a theoretical question, not a "need the job done asap" one.

Recommended answer

This might work for you:

# create multiline test data
cat <<\! >/tmp/a
> this
> this { this needs
> changing to
> that } that
> that
> !
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/a
this
this { THIS needs
changing to
THAT } that
that
# convert multiline test data to a single line
tr '\n' ' ' </tmp/a >/tmp/b
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/b
this this { THIS needs changing to THAT } that that

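Explanation:

  • Read the data into the pattern space (PS): /{/!b;:a;/}/!{$q;N;ba}
  • Copy the data into the hold space (HS): h
  • Strip non-data from the front and back of the string: s/[^{]*{//;s/}.*//
  • Convert the data, e.g. s/this\|that/\U&/g
  • Swap to the HS and append the converted data: x;G
  • Replace the old data with the converted data: s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/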
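
The same one-liner can be laid out one command per line, which may make the steps easier to follow. This is only a reformatted sketch of the command above, assuming GNU sed, run against the same test data:

```shell
result=$(printf '%s\n' 'this' 'this { this needs' 'changing to' 'that } that' 'that' |
sed '
  # Lines without an opening brace are printed unchanged.
  /{/!b
  # Keep appending lines until the closing brace is in the pattern space.
  :a
  /}/!{
    $q
    N
    ba
  }
  # Save the untouched text in the hold space.
  h
  # Strip everything up to the opening brace and from the closing brace on.
  s/[^{]*{//
  s/}.*//
  # Transform the block contents (here: uppercase this and that).
  s/this\|that/\U&/g
  # Pattern space becomes: original text, newline, transformed block.
  x
  G
  # Swap the original block for the transformed one.
  s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/
')
printf '%s\n' "$result"
```

Note that this form handles one block per line group; text after the closing brace on the same line is left as it is.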

A more complicated answer which I think caters for more than one block per line.

# slurp file into pattern space (PS)
:a
$! {
N
ba
}
# check for presence of \v if so quit with exit value 1
/\v/q1
# replace original newlines with \v's
y/\n/\v/
# append a newline to PS as a delimiter
G
# copy PS to hold space (HS)
h
# starting from right to left delete everything but blocks
:b
s/\(.*\)\({.*}\).*\n/\1\n\2/
tb
# delete any non-block details from the start of the file
s/.*\n//
# PS contains only block details
# do any block processing here e.g. uppercase this and that
s/th\(is\|at\)/\U&/g
# append ps to hs
H
# swap to HS
x
# replace each original block with its processed one from right to left
:c
s/\(.*\){.*}\(.*\)\n\n\(.*\)\({.*}\)/\1\n\n\4\2\3/
tc
# delete newlines
s/\n//g
# restore original newlines
y/\v/\n/
# done!
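
A possible way to try the script, assuming GNU sed. The script is written to a temporary file (the variable name and the sample input are made up for illustration) and run with `sed -f` on input that has two blocks, one of them spanning a line break:

```shell
# Save a comment-free copy of the script above to a temporary file.
script=$(mktemp)
cat >"$script" <<'EOF'
:a
$!{
N
ba
}
/\v/q1
y/\n/\v/
G
h
:b
s/\(.*\)\({.*}\).*\n/\1\n\2/
tb
s/.*\n//
s/th\(is\|at\)/\U&/g
H
x
:c
s/\(.*\){.*}\(.*\)\n\n\(.*\)\({.*}\)/\1\n\n\4\2\3/
tc
s/\n//g
y/\v/\n/
EOF

# Two blocks on the first line; the second block continues on the next line.
result=$(printf '%s\n' 'this { this } that { that' 'this } that' | sed -f "$script")
rm -f "$script"
printf '%s\n' "$result"
```

Only the words inside the braces should come out uppercased, with the original line break restored.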

N.B. This uses GNU specific options but could be tweaked to work with generic sed's.
