How to remove dupes from blocks of text

Views: 129  Posted: 2016/7/29 11:17:45  Tags: ruby perl awk sed


Question

What's a smart and easy way to remove dupes within each block of a text file? Blocks are separated by two newlines (that is, by a blank line).

BEFORE:

apple
banana
apple
cherry
cherry

delta
epsilon
delta
epsilon

apple pie
delta
delta

AFTER:

apple
banana
cherry

delta
epsilon

apple pie
delta

Thanks. Should work on a Mac. Allow unicode. Any shell method/language/command. Dupes are not necessarily consecutive. Bonus if you ignore leading/trailing whitespace, or can use a comma as the delimiter within a record.

Solution

$ awk '!NF{delete seen} !seen[$0]++' file
apple
banana
cherry

delta
epsilon

apple pie
delta
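
For readers unfamiliar with the idiom, the one-liner above can be spelled out with comments. This is only an expanded sketch of the same command, not a different method; the wrapper name `dedupe_blocks` is made up here for illustration.

```shell
# dedupe_blocks is a hypothetical wrapper name, not part of the original answer.
dedupe_blocks() {
  awk '
    # A line with no fields (empty or whitespace-only) ends a block:
    # forget every line seen so far, print the separator, move on.
    NF == 0 { delete seen; print; next }

    # seen[$0]++ is 0 (false) the first time a line occurs in the block,
    # so !seen[$0]++ is true only then, and the default action prints it.
    !seen[$0]++
  ' "$@"
}

# Usage: dedupe_blocks file
#    or: some_command | dedupe_blocks
```

Because `seen` is reset at every blank line, a line such as `delta` can still appear once in each block.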

To ignore (as opposed to remove) leading/trailing white space with GNU awk for gensub() would be:

$ awk '!NF{delete seen} !seen[gensub(/^\s+|\s+$/,"","g")]++' file
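
Since gensub() is a GNU awk extension, the command above needs gawk to be installed. If only a stock awk is available, something along these lines may do the same job; it is a sketch that assumes "white space" means ASCII spaces and tabs only, and the name `dedupe_blocks_trim` is hypothetical.

```shell
# dedupe_blocks_trim is a hypothetical name; it avoids gensub() and uses
# only sub(), which standard awk provides.
dedupe_blocks_trim() {
  awk '
    # Blank line: start a new block and keep the separator.
    NF == 0 { delete seen; print; next }
    {
      key = $0                   # compare on a copy so the printed line is unchanged
      sub(/^[ \t]+/, "", key)    # ignore leading spaces/tabs
      sub(/[ \t]+$/, "", key)    # ignore trailing spaces/tabs
      if (!seen[key]++) print
    }
  ' "$@"
}

# Usage: dedupe_blocks_trim file
```

As in the gawk version, the first occurrence is printed with its original white space; only the comparison ignores it.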

I've no idea what you mean by "can use a comma as the delimiter within a record" in this context.
