Find content of one file from another file in UNIX
Problem description
I have 2 files. The first file contains a list of row IDs of tuples of a table in the database. The second file contains SQL queries with these row IDs in the "where" clause of the query.
For example:
File 1
1610657303
1610658464
1610659169
1610668135
1610668350
1610670407
1610671066
File 2
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;
I have to read File 1, search File 2 for all the SQL commands that match the row IDs from File 1, and dump those SQL queries into a third file.
File 1 has 100,000 entries and File 2 contains 10 times as many, i.e. 1,000,000.
I used:
grep -f File_1 File_2 > File_3
But this is extremely slow; the rate is about 1000 entries per hour.
Is there any faster way to do this?
One way with awk:
awk -v FS="[ =]" 'NR==FNR{rows[$1]++;next}(substr($NF,1,length($NF)-1) in rows)' File1 File2
This should be pretty quick. On my machine, it took under 2 seconds to create a lookup of 1 million entries and compare it against 3 million lines.
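To make the one-liner easier to follow, here is the same command spread over several lines and run against the sample data from the question. This is only a sketch: the temporary working directory and the output name File_3 are assumptions added for illustration, and it relies on every query ending in the row ID followed by a semicolon, as in the sample.

```shell
# Work in a scratch directory so the sample files do not clobber anything (assumption).
cd "$(mktemp -d)" || exit 1

# Recreate the sample files from the question.
printf '%s\n' 1610657303 1610658464 1610659169 1610668135 \
              1610668350 1610670407 1610671066 > File_1
cat > File_2 <<'EOF'
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;
EOF

# Fields are split on spaces and "=" signs.
awk -v FS='[ =]' '
    # While reading the first file (NR == FNR), store each row ID as an array key.
    NR == FNR { rows[$1]++; next }
    # For the second file, the last field looks like "1610668350;".
    # Strip the trailing semicolon and print the line if the ID is a known key.
    (substr($NF, 1, length($NF) - 1) in rows)
' File_1 File_2 > File_3

cat File_3
```

With the sample data, File_3 ends up with the two update statements for 1610668350 and 1610668135. Each line of File 2 costs one array lookup, so the work grows with the combined size of the two files rather than with their product.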
Machine Specs:
Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (8 cores)
98 GB RAM
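For comparison, the original grep attempt might also speed up if the IDs are treated as fixed strings instead of regular expressions. The following is a hedged sketch, not something timed against the data described above: -F is a standard grep option, while -w (whole-word matching, so that one ID cannot match inside a longer number) is available in GNU and BSD grep but is not required by POSIX. The scratch directory and file contents are the question's sample data, recreated here for illustration.

```shell
# Work in a scratch directory (assumption, for illustration only).
cd "$(mktemp -d)" || exit 1

# Sample data from the question.
printf '%s\n' 1610657303 1610658464 1610659169 1610668135 \
              1610668350 1610670407 1610671066 > File_1
cat > File_2 <<'EOF'
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;
EOF

# -F: patterns are literal strings; -w: match whole words only;
# -f File_1: read one pattern per line from File_1.
grep -F -w -f File_1 File_2 > File_3

cat File_3
```

Unlike the awk version, this matches an ID anywhere on the line, not only in the final field, so it is only equivalent when the IDs cannot appear elsewhere in the queries.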