sed: replacing nth word with matched pattern?
I have a text file with the following characteristics:
- every line has at least three "words" separated by a space
- a "word" can be any character or string of characters
I have appended some notes to some of the lines with tentative suggestions for changes to be made to the original words, and now would like to use sed to make those changes for me. So, to give a clearer picture, my file looks like this:
NO NO O
SIGNS NN O #NNS
GIVEN VBD B-VP #VBN
AT IN O
THIS NN O
TIME NN O ## B-NP
. PER O
...
Notes with 1 # are to replace the SECOND word in a line, and notes with 2 #'s are to replace the THIRD word in a line. Would anybody be able to suggest a way to do this with sed (or awk, or anything else)? Again to clarify (hopefully), my goal is to get the pattern following the # or ## and replace the nth word of the line with the matched pattern.
Thanks.
This will work for you:
awk '/#/{sub(/# +/,"#");n=gsub(/#/,"",$NF);$(n+1)=$NF;$NF="\t\t#"}1' file
Explanation
- /#/{ ... } : Search for lines that contain # and perform the following steps...
- sub(/# +/,"#") : Remove all spaces between the notes and the # if necessary
- n=gsub(/#/,"",$NF) : Remove all # from the last field $NF and set the number of #'s removed to the variable n
- $(n+1)=$NF : Set the n+1 field $(n+1) to the new last field $NF, which has all the # stripped off
- $NF="\t\t#" : Set the last field $NF to two tabs followed by a #
- 1 : Shortcut to tell awk to print the altered line
- file : Your input file
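The steps above can also be written out as a commented, multi-line script. This is only a sketch of the same one-liner: the function name apply_notes is made up for illustration, and the sample lines are the ones from the question, fed in on standard input.

```shell
# apply_notes is a hypothetical wrapper name; it reads lines from standard input.
apply_notes() {
  awk '
  /#/ {
      sub(/# +/, "#")          # join "## B-NP" into "##B-NP" so the note is a single field
      n = gsub(/#/, "", $NF)   # strip the #s from the note; n is how many were removed
      $(n + 1) = $NF           # one # targets field 2, two #s target field 3
      $NF = "\t\t#"            # overwrite the used note with two tabs and a bare #
  }
  { print }                    # same effect as the trailing 1 in the one-liner
  '
}

# Sample input taken from the question.
apply_notes <<'EOF'
NO NO O
SIGNS NN O #NNS
GIVEN VBD B-VP #VBN
AT IN O
THIS NN O
TIME NN O ## B-NP
. PER O
EOF
```

Because assigning to a field makes awk rebuild the line with single spaces between fields, the changed lines come out as, for example, SIGNS NNS O followed by a space, two tabs and a #.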
Example
$ awk '/#/{sub(/# +/,"#");n=gsub(/#/,"",$NF);$(n+1)=$NF;$NF="\t\t#"}1' file
NO NO O
SIGNS NNS O #
GIVEN VBN B-VP #
AT IN O
THIS NN O
TIME NN B-NP #
. PER O
...
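Since the question asked about sed first, here is a rough sed counterpart for comparison. It is a sketch under extra assumptions that the awk answer does not make: every annotated line has exactly three words before the note, the note itself contains no spaces, and the note is dropped entirely instead of being replaced by a bare #. The function name apply_notes_sed is made up, and -E (extended regular expressions) is assumed to be available, as it is in GNU and BSD sed.

```shell
# apply_notes_sed is a hypothetical name; it reads lines from standard input.
apply_notes_sed() {
  sed -E \
    -e 's/^([^ ]+) +([^ ]+) +[^ ]+ +## *([^ #]+)$/\1 \2 \3/' \
    -e 's/^([^ ]+) +[^ ]+ +([^ ]+) +# *([^ #]+)$/\1 \3 \2/'
}
# First expression:  a ## note replaces the third word.
# Second expression: a single # note replaces the second word.

printf '%s\n' 'SIGNS NN O #NNS' 'TIME NN O ## B-NP' 'AT IN O' | apply_notes_sed
```

Lines without a note match neither expression and pass through unchanged.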
Note: If your notes always follow the # with no spaces in between, you can remove the entire sub(/# +/,"#"); part of the awk command to make it even shorter.
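As a quick illustration of that note, the shortened command might look like this. It assumes every note is written directly after its # characters (##B-NP, not ## B-NP); the function name apply_notes_short is made up for the example.

```shell
# apply_notes_short is a hypothetical name; it reads lines from standard input.
apply_notes_short() {
  # Same command as in the answer, minus the sub() that joins "# note" into "#note".
  awk '/#/{n=gsub(/#/,"",$NF);$(n+1)=$NF;$NF="\t\t#"}1'
}

# Notes here have no space after the # characters.
printf '%s\n' 'SIGNS NN O #NNS' 'TIME NN O ##B-NP' 'AT IN O' | apply_notes_short
```

If a space does slip in after the #, the last field no longer contains any #, n becomes 0, and the first word of the line is overwritten instead, so keep the sub() call unless the input is known to be clean.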