Need to grep for first occurrences of multiple strings


Problem description

I am attempting to return the first occurrence of each of multiple strings, i.e., I want to select the lines from the following text where 1259, 3009, and 1589 each occur for the first time.

ADWN    1259    11:00   B23

ADWN    3009    12:00   B19

DDWN     723    11:30   B04

ADWN    1589    14:20   B12

ADWN    1259    11:10   B23

DDWN    2534    13:00   B16

ADWN    3009    11:50   B14

This gives me all matches:

grep '1259\|3009\|1589'  somelog.log

And this gives me only the first match:

grep -m 1  '1259\|3009\|1589'  somelog.log

I want to return the following:

ADWN    1259    11:00   B23

ADWN    3009    12:00   B19

ADWN    1589    14:20   B12

I think that creating a file with the required values, and then looping through the file, passing each number individually into the grep command, will give me what I am looking for, but I haven't found an example of this. Is there a simple solution for this? Is a loop the best way to handle it, or has this already been answered elsewhere?

Thanks in advance for your ideas and suggestions--

Clyde

Solution

One way using awk:

awk '!array[$2]++ && $2 ~ /^1259$|^3009$|^1589$/' file.txt

Results:

ADWN    1259    11:00   B23
ADWN    3009    12:00   B19
ADWN    1589    14:20   B12

Edit:

I should really get into the habit of reading the whole question first. I see that you're thinking of creating a file with the values you'd like to find the first occurrence of. Put these in a file called values.txt with one value per line. For example, here are the contents of values.txt:

1259
3009
1589

Then run this:

awk 'FNR==NR { array[$0]++; next } $2 in array { print; delete array[$2] }' values.txt file.txt

Results:

ADWN    1259    11:00   B23
ADWN    3009    12:00   B19
ADWN    1589    14:20   B12


1st command explanation:

The array counts how many times each value of the second column ($2) has been seen. A line is printed only when its $2 has not been seen before and equals one of the three listed values. awk prints the whole line by default when a pattern has no action.
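To make the "first time seen" test easier to read, here is a minimal sketch of an equivalent form of the first command. It is not the answer's exact command: the array name seen, the grouped regex ^(1259|3009|1589)$, the explicit { print } and the temporary directory are assumptions made for illustration. The sample rows are copied from the question.

```shell
# Sample data copied from the question; the temporary path is an assumption.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'ADWN    1259    11:00   B23' \
  'ADWN    3009    12:00   B19' \
  'DDWN     723    11:30   B04' \
  'ADWN    1589    14:20   B12' \
  'ADWN    1259    11:10   B23' \
  'DDWN    2534    13:00   B16' \
  'ADWN    3009    11:50   B14' > "$tmpdir/file.txt"

# seen[$2]++ yields the previous count for this value of column 2,
# so !seen[$2]++ is true only the first time that value appears.
# The regex is tested first here, so only wanted values are counted.
first_hits=$(awk '$2 ~ /^(1259|3009|1589)$/ && !seen[$2]++ { print }' "$tmpdir/file.txt")
printf '%s\n' "$first_hits"

rm -rf "$tmpdir"
```

Swapping the order of the two tests does not change which lines are printed. It only avoids storing counts for values that are not of interest.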

2nd command explanation:

FNR is the number of records read so far from the current input file.
NR is the total number of records read so far across all input files.
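The difference between the two counters is easiest to see by printing them. The following is a small sketch, assuming a three-line values.txt and a shortened two-line file.txt created in a temporary directory. The file contents and the output labels are chosen for illustration only.

```shell
# Create two small input files in a scratch directory (paths are assumptions).
tmpdir=$(mktemp -d)
printf '%s\n' 1259 3009 1589 > "$tmpdir/values.txt"
printf '%s\n' 'ADWN 1259 11:00 B23' 'ADWN 3009 12:00 B19' > "$tmpdir/file.txt"

# Print the file name and both counters for every record.
# NR keeps counting across files; FNR restarts at 1 for each new file.
counters=$(cd "$tmpdir" && awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' values.txt file.txt)
printf '%s\n' "$counters"

rm -rf "$tmpdir"
```

For the three records of values.txt the two counters are equal (1, 2, 3). For file.txt, NR continues with 4 and 5 while FNR starts again at 1 and 2.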

The FNR==NR { ... } block runs only while awk is reading the first input file, because that is the only time the two counters are equal. So for each of the lines in values.txt, we add the whole line ($0) as a key in an array (I've called it array, but you could give it another name). next forces awk to read the next line in values.txt (and skip processing the rest of the command). When FNR==NR is no longer true, the second file in the arguments list is being read. We then check whether the second column ($2) is a key in the array. If it is, we print the line and remove that key from the array. By using delete we essentially set a max count of one.
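For comparison, the loop described in the question (one grep call per value) can be sketched as follows. This is a hedged illustration, not part of the answer above. It assumes a grep that supports -m (available in GNU and BSD grep, but not required by POSIX), and it uses -w and -F so that each value is matched as a whole word and as a fixed string. The temporary paths are assumptions.

```shell
# Sample inputs copied from the question and the answer; paths are assumptions.
tmpdir=$(mktemp -d)
printf '%s\n' 1259 3009 1589 > "$tmpdir/values.txt"
printf '%s\n' \
  'ADWN    1259    11:00   B23' \
  'ADWN    3009    12:00   B19' \
  'DDWN     723    11:30   B04' \
  'ADWN    1589    14:20   B12' \
  'ADWN    1259    11:10   B23' \
  'DDWN    2534    13:00   B16' \
  'ADWN    3009    11:50   B14' > "$tmpdir/somelog.log"

# Read one value per line and stop grep after its first matching line.
loop_hits=$(
  while IFS= read -r value; do
    grep -m 1 -w -F -- "$value" "$tmpdir/somelog.log"
  done < "$tmpdir/values.txt"
)
printf '%s\n' "$loop_hits"

rm -rf "$tmpdir"
```

Two differences from the awk version are worth noting. The loop reads the log once per value, and its output follows the order of values.txt rather than the order of the log. It also matches the value in any column, not only the second.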
