grep matching specific position in lines using words from other file
I have 2 files.
file1:
12342015010198765hello
12342015010188765hello
12342015010178765hello
Each line contains fields at fixed positions; for example, positions 13-17 hold the account_id.
file2:
98765
88765
which contains a list of account_ids.
In Korn Shell, I want to print the lines from file1 whose positions 13-17 match one of the account_ids in file2.
I can't simply run
grep -f file2 file1
because an account_id in file2 can also match other fields at other positions.
I tried putting a pattern like this in file2:
^.{12}98765.*
but it did not work.
Using awk
$ awk 'NR==FNR{a[$1]=1;next;} substr($0,13,5) in a' file2 file1
12342015010198765hello
12342015010188765hello
How it works
NR==FNR{a[$1]=1;next;}
FNR is the number of lines read so far from the current file and NR is the total number of lines read so far. Thus, if FNR==NR, we are reading the first file, which is file2. Each ID in file2 is saved in array a. Then, we skip the rest of the commands and jump to the next line.
substr($0,13,5) in a
If we reach this command, we are working on the second file, file1. This condition is true if the 5-character substring that starts at position 13 is in array a. If the condition is true, then awk performs the default action, which is to print the line.
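The two-rule awk program described above can be spelled out with comments. This is only a sketch: it recreates the sample file1 and file2 from the question in a temporary directory, and the array name ids and the variable matches are illustrative choices, not part of the original one-liner.

```shell
#!/bin/sh
# Sample data copied from the question; the temp directory exists only for this demo.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
12342015010198765hello
12342015010188765hello
12342015010178765hello
EOF
cat > "$dir/file2" <<'EOF'
98765
88765
EOF

matches=$(awk '
    # Rule 1: NR equals FNR only while the first file (file2) is being read.
    # Remember each ID, then skip the second rule for this line.
    NR == FNR { ids[$1] = 1; next }
    # Rule 2: reached only for file1. Print the line (default action)
    # when characters 13-17 form a remembered ID.
    substr($0, 13, 5) in ids
' "$dir/file2" "$dir/file1")

printf '%s\n' "$matches"
rm -rf "$dir"
```

The order of the file arguments matters: file2 must come first so that the ID list is loaded before any line of file1 is tested.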
Using grep
You mentioned trying:
grep '^.{12}98765.*' file1
That pattern uses extended regex syntax (the {12} repetition count), which means that -E is required. Also, there is no value in matching .* at the end: it always matches. Thus, try:
$ grep -E '^.{12}98765' file1
12342015010198765hello
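For comparison, the same match can be written in basic regex syntax by escaping the braces, which avoids -E. The sketch below recreates file1 from the question in a temporary directory; the variable names ere and bre are just for illustration.

```shell
#!/bin/sh
# Sample data copied from the question; the temp directory exists only for this demo.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
12342015010198765hello
12342015010188765hello
12342015010178765hello
EOF

# Extended regex: {12} is a repetition count.
ere=$(grep -E '^.{12}98765' "$dir/file1")
# Basic regex: the braces must be backslash-escaped to mean repetition.
bre=$(grep '^.\{12\}98765' "$dir/file1")

printf 'ERE: %s\nBRE: %s\n' "$ere" "$bre"
rm -rf "$dir"
```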
To get both lines:
$ grep -E '^.{12}[89]8765' file1
12342015010198765hello
12342015010188765hello
This works because [89]8765 just happens to match the IDs of interest in file2. The awk solution, of course, provides more flexibility in what IDs to match.
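If grep is still preferred, one possible way to avoid hand-writing a character class is to generate an anchored pattern for every ID in file2 and pass the result to grep -E -f. This is a sketch under assumptions: the IDs contain no regex metacharacters, the patterns file name is hypothetical, and the sample files from the question are recreated in a temporary directory.

```shell
#!/bin/sh
# Sample data copied from the question; the temp directory exists only for this demo.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
12342015010198765hello
12342015010188765hello
12342015010178765hello
EOF
cat > "$dir/file2" <<'EOF'
98765
88765
EOF

# Turn each ID into a pattern such as ^.{12}98765, which pins the ID
# to positions 13-17 by skipping exactly 12 leading characters.
sed 's/^/^.{12}/' "$dir/file2" > "$dir/patterns"

# -E is needed for the {12} repetition; -f reads one pattern per line.
matches=$(grep -E -f "$dir/patterns" "$dir/file1")

printf '%s\n' "$matches"
rm -rf "$dir"
```

This keeps the ID list in file2 unchanged and builds the patterns on the fly, so the same command works when the list of IDs changes.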