Removing duplicate strings with SED

Problem description

I use the buildroot package to port software packages to an embedded Linux system. Some of these packages produce plain-text scripts and/or library control files that contain references to staging directories. These references must be removed when the software is packaged for distribution. I have no problem using SED to remove them. However, this processing leaves some undesired duplicate strings, as in the excerpt below. I would like to know whether SED can remove such duplicates.

Note 1: The 'dependency_libs=' lines were originally left out and have now been added, as shown below. I was trying to keep the excerpt succinct and did not include 'dependency_libs=' before because it doesn't contain any duplicates. Apparently, it plays an important part in some of the suggested solutions below, so I have amended the excerpt here for posterity.

Note 2: I just found a small bug in the sed script from @potong. If the duplicate string is the last item on the line and is not followed by a space, the script fails. In this case, the script partially fails on the first 'dependency_libs=' line. The second 'dependency_libs=' line has a space at the end (right before the closing single quote) and passes through the script without a problem. I have amended the excerpt here to show the difference.

cppflags=-I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine
cxxflags=-I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine 
Cflags: -I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine 
Libs: -L/usr/lib -L/lib -L/usr/lib -L/lib -L${libdir} -lmine${suffix}
dependency_libs='-L/usr/lib -L/lib -L/usr/lib -L/lib -L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib -L/usr/lib -L/lib -L/usr/lib -L/lib'
dependency_libs='-L/usr/lib -L/lib -L/usr/lib -L/lib -L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib -L/usr/lib -L/lib -L/usr/lib -L/lib '

so that it becomes:

cppflags=-I/usr/include -I/include -I${includedir}/mine
cxxflags=-I/usr/include -I/include -I${includedir}/mine                        
Cflags: -I/usr/include -I/include -I${includedir}/mine                         
Libs: -L/usr/lib -L/lib -L${libdir} -lmine${suffix}
dependency_libs='-L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib'
dependency_libs='-L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib'

Recommended answer

This might work for you (GNU sed):

sed -r ':a;s|((-[IL]/\S+\s).*)\2|\1|;ta' file

This looks for strings beginning with -I/ or -L/, followed by one or more non-space characters and a whitespace character, that are repeated later on the line, and removes the second occurrence. If a substitution takes place, the process is repeated until no more substitutions occur.
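As a rough illustration of the loop described above, the sketch below runs the same command on two of the sample lines from the question, and then on a pair of made-up lines that show the trailing-space limitation mentioned in Note 2. The `dedup_flags` wrapper and the `flags=` lines are hypothetical names introduced only for this example. It assumes GNU sed, since `-r`, `\S` and `\s` are GNU extensions.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): wraps the answer's
# GNU sed command so it can be applied to text on standard input.
dedup_flags() {
    # :a  defines a label; s|...|...| drops a later repeat of a -I/ or -L/
    # token (captured together with its trailing whitespace as \2);
    # ta  jumps back to :a whenever a substitution was made.
    sed -r ':a;s|((-[IL]/\S+\s).*)\2|\1|;ta'
}

# Two of the sample lines from the question.
printf '%s\n' \
    'cppflags=-I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine' \
    'Libs: -L/usr/lib -L/lib -L/usr/lib -L/lib -L${libdir} -lmine${suffix}' |
    dedup_flags
# prints:
# cppflags=-I/usr/include -I/include -I${includedir}/mine
# Libs: -L/usr/lib -L/lib -L${libdir} -lmine${suffix}

# Made-up lines illustrating Note 2: the pattern requires whitespace after
# each token, so a duplicate that ends right before the closing quote is
# left alone, while the same duplicate followed by a space is removed.
printf '%s\n' "flags='-L/lib -L/lib'" "flags='-L/lib -L/lib '" | dedup_flags
# prints:
# flags='-L/lib -L/lib'
# flags='-L/lib '
```

Because the pattern only matches tokens that start with `-I/` or `-L/`, entries such as `-I${includedir}/mine` and `-lmine${suffix}` are never touched.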
