Removing duplicate strings with SED
Question
I use buildroot to port software packages to an embedded Linux system. Some packages also generate plain-text scripts and/or library control files that contain references to staging directories. These references have to be removed when the software is packaged for distribution. Removing them with SED is not a problem. However, that processing leaves behind unwanted duplicate strings, as in the excerpt below. I would like to know whether SED can remove such duplicates.
Note 1: The 'dependency_libs=' lines were originally left out and have now been added below. I was trying to keep the excerpt succinct and did not include 'dependency_libs=' before because it doesn't contain any duplicates. Apparently it plays an important part in some of the suggested solutions below, so I have added it here for posterity.
Note 2: I just found a small bug in the sed script from @potong. If the duplicate string is the last item on the line and is not followed by a space, the sed script fails. In this case, the sed script partially fails on the 1st 'dependency_libs=' line. The 2nd 'dependency_libs=' line has a space at the end of the line (right before the closing single quote) and goes through the sed script without a problem. I have amended the excerpt here to show the difference.
cppflags=-I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine
cxxflags=-I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine
Cflags: -I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine
Libs: -L/usr/lib -L/lib -L/usr/lib -L/lib -L${libdir} -lmine${suffix}
dependency_libs='-L/usr/lib -L/lib -L/usr/lib -L/lib -L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib -L/usr/lib -L/lib -L/usr/lib -L/lib'
dependency_libs='-L/usr/lib -L/lib -L/usr/lib -L/lib -L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib -L/usr/lib -L/lib -L/usr/lib -L/lib '
so that it becomes:
cppflags=-I/usr/include -I/include -I${includedir}/mine
cxxflags=-I/usr/include -I/include -I${includedir}/mine
Cflags: -I/usr/include -I/include -I${includedir}/mine
Libs: -L/usr/lib -L/lib -L${libdir} -lmine${suffix}
dependency_libs='-L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib'
dependency_libs='-L/usr/lib/libiconv-full/lib -L/usr/lib/libintl-full/lib'
Recommended answer
This might work for you (GNU sed):
sed -r ':a;s|((-[IL]/\S+\s).*)\2|\1|;ta' file
This looks for strings beginning with -I/ or -L/, followed by one or more non-space characters and a whitespace character, that are repeated later on the line, and removes the second occurrence. If a substitution takes place, the process is repeated until no more substitutions occur.
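As a rough illustration of the loop described above, the command can be wrapped in a small shell function and run on one of the sample lines from the question. This is only a sketch: the function name dedup_flags is made up for this example and is not part of the original answer, and it assumes GNU sed, since -r, \S and \s are GNU extensions.

```shell
#!/bin/sh
# dedup_flags is a hypothetical helper name used only for this sketch.
# It runs the answer's GNU sed command on whatever arrives on standard input.
dedup_flags() {
    # :a         define the label "a"
    # s|...|\1|  \2 captures one "-I/... " or "-L/... " token including its
    #            trailing whitespace; if the same text appears again later on
    #            the line, that later copy is dropped
    # ta         jump back to "a" whenever the substitution changed the line
    sed -r ':a;s|((-[IL]/\S+\s).*)\2|\1|;ta'
}

# Sample line taken from the question.
printf '%s\n' 'Cflags: -I/usr/include -I/include -I/usr/include -I/include -I${includedir}/mine' | dedup_flags
# prints: Cflags: -I/usr/include -I/include -I${includedir}/mine
```

Because the captured token has to end in whitespace, a final token that is followed directly by the closing quote (as on the 1st 'dependency_libs=' line) is not treated as a duplicate, which appears to be the behaviour reported in Note 2.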