Optimize shell script for multiple sed replacements
Question
I have a file containing a list of replacement pairs (about 100 of them), which sed uses to replace strings in files.
The pairs look like this:
old|new
tobereplaced|replacement
(stuffiwant).*(too)|12
My current code is:
cat replacement_list | while read i
do
old=$(echo "$i" | awk -F'|' '{print $1}') #due to the need for extended regex
new=$(echo "$i" | awk -F'|' '{print $2}')
sed -r "s/`echo "$old"`/`echo "$new"`/g" -i file
done
I can't help but think there is a more efficient way to perform the replacements. I tried turning the loop around to iterate over the lines of the file first, but that turned out to be much more expensive.
Are there any other ways of speeding up this script?
Edit
Thanks for all the quick responses. Let me try out the various suggestions before choosing an answer.
One thing to clear up: I also need subexpression/group functionality. For example, one replacement I might need is:
([0-9])U|10 #the extra brackets and escapes were required for my original code
Some details on the improvements (to be updated):
- Method: processing time
- Original script: 0.85s
- cut instead of awk: 0.71s
- anubhava's method: 0.18s
- chthonicdaemon's method: 0.01s
Recommended answer
You can use sed to produce correctly formatted sed input:
sed -e 's/^/s|/; s/$/|g/' replacement_list | sed -r -f - file
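To see what that pipeline does, here is a minimal sketch that writes the generated commands to an intermediate file so they can be inspected. The sample data, the temporary directory, and the name script.sed are made up for illustration. The `\10` replacement (backreference 1 followed by a literal 0) is an assumption about how a group would be referenced in the list. `-E` is used in place of `-r`; both select extended regular expressions in GNU sed.

```shell
#!/bin/sh
# Illustrative sandbox; all file names and contents here are hypothetical.
workdir=$(mktemp -d)

# Replacement list in the "pattern|replacement" format from the question.
cat > "$workdir/replacement_list" <<'EOF'
old|new
tobereplaced|replacement
([0-9])U|\10
EOF

printf '%s\n' 'old text tobereplaced 5U' > "$workdir/file"

# Step 1: wrap each line as "s|pattern|replacement|g".
sed -e 's/^/s|/; s/$/|g/' "$workdir/replacement_list" > "$workdir/script.sed"
cat "$workdir/script.sed"

# Step 2: run every generated command in one sed process, one pass over the file.
result=$(sed -E -f "$workdir/script.sed" "$workdir/file")
printf '%s\n' "$result"
```

The speedup comes from starting sed once instead of once per pair. One caveat: because `|` is both the list separator and the delimiter of the generated `s` command, a pattern that uses `|` for alternation would need a different separator.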