Optimize shell script for multiple sed replacements

Question

I have a file containing a list of about 100 replacement pairs, which sed uses to replace strings in files.

The pairs look like this:

old|new
tobereplaced|replacement
(stuffiwant).*(too)|\1\2

My current code is:

cat replacement_list | while read i
do
    old=$(echo "$i" | awk -F'|' '{print $1}')    #due to the need for extended regex
    new=$(echo "$i" | awk -F'|' '{print $2}')
    sed -r "s/`echo "$old"`/`echo "$new"`/g" -i file
done

I cannot help but think that there is a more efficient way of performing the replacements. I tried turning the loop around so that it iterates over the lines of the file first, but that turned out to be much more expensive.

Are there any other ways of speeding up this script?

Edit

Thanks for all the quick responses. Let me try out the various suggestions before choosing an answer.

One thing to clear up: I also need subexpression/group functionality. For example, one replacement I might need is:

([0-9])U|\10  #the extra brackets and escapes were required for my original code

Some details on the improvements (to be updated):

  • Method: processing time
  • Original script: 0.85s
  • cut instead of awk: 0.71s
  • anubhava's method: 0.18s
  • chthonicdaemon's method: 0.01s

Recommended answer

You can use sed to produce correctly formatted sed input:

sed -e 's/^/s|/; s/$/|g/' replacement_list | sed -r -f - file
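
To make the two stages of that pipeline visible, here is a minimal sketch that runs it on throwaway data. The contents of replacement_list and file below are invented for illustration, and the sketch assumes GNU sed (for -r and for reading the script from standard input with -f -). The first sed wraps every old|new line as s|old|new|g; the second sed then applies all of those commands in a single pass over the file.

```shell
#!/bin/sh
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical replacement list: one old|new pair per line.
cat > replacement_list <<'EOF'
tobereplaced|replacement
(stuff).*(too)|\1\2
([0-9])U|\10
EOF

# Hypothetical input file.
cat > file <<'EOF'
this is tobereplaced
stuff and more too
5U
EOF

# Stage 1: turn each "old|new" line into an "s|old|new|g" command.
script=$(sed -e 's/^/s|/; s/$/|g/' replacement_list)
printf '%s\n' "$script"

# Stage 2: run the generated commands as one sed script.
# Assumes GNU sed: -r enables extended regex, "-f -" reads the script from stdin.
result=$(printf '%s\n' "$script" | sed -r -f - file)
printf '%s\n' "$result"
```

This likely explains most of the speedup in the timings above: the original loop starts two awk processes and one sed process per pair and rewrites the file each time, whereas here sed reads the file once. To modify the file in place, as the original loop does, add -i to the second sed. Because | is used as the s delimiter, a pattern or replacement that itself contains a literal | would break the generated command.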
