Find content of one file from another file in UNIX

Views: 97. Published: 2018/5/28 19:17:09. Tags: file, grep.


Question

I have two files. The first contains a list of row IDs of tuples from a database table. The second contains SQL queries that use these row IDs in the "where" clause of each query.

For example:

File 1

    1610657303
    1610658464
    1610659169
    1610668135
    1610668350
    1610670407
    1610671066

File 2

    update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
    update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
    update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
    update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;

I have to read File 1, search File 2 for all the SQL commands that match the row IDs from File 1, and dump those SQL queries into a third file.

File 1 has 100,000 entries, and File 2 contains 10 times as many, i.e. 1,000,000.

I used grep -f File_1 File_2 > File_3, but this is extremely slow: it gets through about 1000 entries per hour.

Is there any faster way to do this?
Solution
One way with awk:
awk -v FS="[ =]" 'NR==FNR{rows[$1]++;next}(substr($NF,1,length($NF)-1) in rows)' File1 File2
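To make the one-liner easier to follow, here is a sketch that spreads it over several lines and runs it on the sample data from the question. The scratch directory, the workdir variable, and the File3 output name are assumptions added for illustration; the awk logic itself is unchanged.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1610657303 1610658464 1610659169 1610668135 \
              1610668350 1610670407 1610671066 > "$workdir/File1"
cat > "$workdir/File2" <<'EOF'
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;
EOF

# Same logic as the one-liner, with one rule per line.
awk -v FS='[ =]' '
    # First file (NR == FNR): remember every row ID as an array key.
    NR == FNR { rows[$1]++; next }
    # Second file: the last field looks like "1610668350;". Strip the
    # trailing semicolon and print the line if the ID is an array key.
    (substr($NF, 1, length($NF) - 1) in rows)
' "$workdir/File1" "$workdir/File2" > "$workdir/File3"

cat "$workdir/File3"
```

With the sample data, only the statements for 1610668350 and 1610668135 are written to File3, since those are the only IDs present in both files. Each line of File2 costs a single array lookup rather than a comparison against every pattern.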
The awk one-liner should be pretty quick. On my machine, it took under 2 seconds to build a lookup of 1 million entries and compare it against 3 million lines.

Machine Specs:
Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (8 cores)
98 GB RAM
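
As a hedged alternative, not part of the answer above and not timed here: the grep -f command in the question treats every line of File_1 as a regular expression. GNU and BSD grep also accept -F (fixed strings) and -w (whole words), which may be faster for a plain list of IDs. A minimal sketch on the sample data, with the tmpdir scratch directory as an assumption:

```shell
# Sample data from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 1610657303 1610658464 1610659169 1610668135 \
              1610668350 1610670407 1610671066 > "$tmpdir/File_1"
cat > "$tmpdir/File_2" <<'EOF'
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668350;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672154;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610668135;
update TABLE_X set ATTRIBUTE_A=87 where ri=1610672153;
EOF

# -F: treat each line of File_1 as a fixed string, not a regular expression.
# -w: match whole words only, so an ID cannot match inside a longer number.
# -f: read the patterns from File_1.
grep -F -w -f "$tmpdir/File_1" "$tmpdir/File_2" > "$tmpdir/File_3"

cat "$tmpdir/File_3"
```

Unlike the awk approach, this matches an ID anywhere on the line, not only in the final "ri=" field, so it is only equivalent when the IDs cannot appear elsewhere in a statement.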
