Advanced `uniq` with "unique part regex"

Views: 179 | Published: 2016/7/28 16:53:02 | Tags: regex linux shell awk uniq


Problem description


uniq is a tool that lets one filter the lines of a file so that only unique lines are shown. uniq has some support for specifying when two lines are "equivalent", but the options are limited.

I'm looking for a tool/extension on uniq that allows one to enter a regex. If the captured group is the same for two lines, then the two lines are considered "equivalent". Only the "first match" is returned for each equivalence class.

Example:

file.dat:

foo!bar!baz
!baz!quix
!bar!foobar
ID!baz!

Using grep -P '(!\w+!)' -o, one can extract the "unique parts":

!bar!
!baz!
!bar!
!baz!

This means that the first line is considered to be "equivalent" with the third and the second with the fourth. Thus only the first and the second are printed (the third and fourth are ignored).

Then uniq '(!\w+!)' < file.dat should return:

foo!bar!baz
!baz!quix

Solution

This does not use uniq, but with GNU awk (gawk) you can get the result you want:

awk -v re='![[:alnum:]]+!' 'match($0, re, a) && !(a[0] in p) {p[a[0]]; print}' file
foo!bar!baz
!baz!quix

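The required regex is passed in through a command-line variable, -v re=...
The match function tests each line against the regex and stores the matched text in the array a, with a[0] holding the whole match. The three-argument form of match is a GNU awk extension.
Each time a match succeeds and its text has not been seen before, the matched text is recorded as a key in the associative array p and the line is printed.
This effectively gives uniq behaviour with regex support. Lines that do not match the regex at all are not printed.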
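
The steps above can also be sketched without the gawk-only third argument to match, for systems where awk is not GNU awk. This is a minimal sketch, not part of the original answer: the function name uniq_by_regex is hypothetical, and it relies on the RSTART and RLENGTH variables that match sets in POSIX awk. The regex uses an explicit bracket expression as a stand-in for \w. Because awk interprets escape sequences in -v assignments, a regex containing backslashes would need them doubled.

```shell
# uniq_by_regex is a hypothetical helper name, not an existing tool.
# Usage: uniq_by_regex 'ERE' < input
uniq_by_regex() {
    awk -v re="$1" '
        # Two-argument match() sets RSTART and RLENGTH on success.
        match($0, re) {
            key = substr($0, RSTART, RLENGTH)  # the "unique part" of this line
            if (!(key in seen)) {              # first line of its equivalence class
                seen[key]                      # referencing the key creates the entry
                print
            }
        }
    '
}

# Sample data from the question, fed on standard input.
printf '%s\n' 'foo!bar!baz' '!baz!quix' '!bar!foobar' 'ID!baz!' |
    uniq_by_regex '![A-Za-z0-9_]+!'
# prints:
# foo!bar!baz
# !baz!quix
```

Reading from standard input mirrors the uniq '(!\w+!)' < file.dat call shape asked for in the question. As with the gawk one-liner, lines that do not match the regex are dropped rather than passed through.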
