Shell script to find, search and replace array of strings in a file
Problem description
This is linked to another question (a code golf) I asked at http://stackoverflow.com/questions/3171552/code-golf-color-highlighting-of-repeated-text
I've got a file 'sample1.txt' with the following content:
LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook.
I've got a script generating the following array of strings which occur in the file (only a few shown for illustration):
LoremIpsum
LoremIpsu
dummytext
oremIpsum
LoremIps
dummytex
industry
oremIpsu
remIpsum
ummytext
LoremIp
dummyte
emIpsum
industr
mmytext
I need to check (from the top) whether 'LoremIpsum' occurs in the file sample1.txt. If so, I want to replace all occurrences of LoremIpsum with <T1>LoremIpsum</T1>. When the program then moves to the next word, 'LoremIpsu', it should NOT match against the <T1>LoremIpsum</T1> text inside sample1.txt. This should be repeated for all elements of the 'array'. The next 'valid' one would be 'dummytext', which should be tagged as <T2>dummytext</T2>.
I think it should be possible to do this with a Bash shell script rather than relying on Perl/Python/Ruby programs.
Pure Bash (no externals)
At the Bash command line:
$ sample="LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook."
$ # or: sample=$(<sample1.txt)
$ array=(
LoremIpsum
LoremIpsu
dummytext
...
)
$ tag=0; for entry in "${array[@]}"; do test="<[^>/]*>[^>]*$entry[^<]*</"; if [[ ! $sample =~ $test ]]; then ((tag++)); sample=${sample//"$entry"/<T$tag>$entry</T$tag>}; fi; done; echo "Output:"; echo "$sample"
Output:
<T1>LoremIpsum</T1>issimply<T2>dummytext</T2>oftheprintingandtypesetting<T3>industry</T3>.<T1>LoremIpsum</T1>hasbeenthe<T3>industry</T3>'sstandard<T2>dummytext</T2>eversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook.
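For readability, the one-liner above can also be written out as a small function. This is only a sketch under a few assumptions: the name tag_entries is hypothetical, the entries are assumed to be plain alphanumeric strings (they are interpolated into a regular expression unescaped), and, unlike the one-liner, entries that do not occur in the text at all are skipped so that no tag number is used up on them.

```shell
#!/usr/bin/env bash
# Sketch only: tag_entries is a hypothetical helper, not part of the
# original answer. Usage: tag_entries TEXT ENTRY...

tag_entries() {
    local text=$1; shift           # first argument: the text to process
    local tag=0 entry inside
    for entry in "$@"; do
        # Assumption: skip entries that are absent from the text.
        [[ $text == *"$entry"* ]] || continue
        # Matches when the entry lies between an opening tag and the
        # next closing tag, i.e. inside an already tagged span.
        inside="<[^>/]*>[^>]*${entry}[^<]*</"
        if [[ ! $text =~ $inside ]]; then
            tag=$((tag + 1))
            # Quoting "$entry" keeps it literal in the search pattern.
            text=${text//"$entry"/<T$tag>$entry</T$tag>}
        fi
    done
    printf '%s\n' "$text"
}

# Example call with a shortened sample string and a few entries.
tag_entries "LoremIpsumissimplydummytext.LoremIpsumhasdummytext" \
    LoremIpsum LoremIpsu dummytext oremIpsum dummytex
```

As in the one-liner, the check is made against the whole text, so an entry is skipped entirely as soon as any one of its occurrences lies inside an existing tag, even if another occurrence elsewhere is still untagged.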