Modifying a text file without reading it into memory

Question

I am trying to find a way to modify a text file (specifically, to delete certain lines) without reading a large part of the file into memory or rewriting the whole file. The files in question are larger than main memory, roughly 15-50 GB.

P.S. I am using Linux.

Recommended answer

You aren't going to get around making a new file, so just bite the bullet and do it. Use grep with the appropriate options and redirect the result to a second file:

$ grep -v -f patternsToExcludeFromInput input > output

Another approach is to put the patterns into, for example, a hash table (Perl), a dictionary (Python), or an unordered_map (C++), and then check each line of the input file for a match.

If there is no match, print the line to the standard output stream (which you can redirect to a regular file). Your memory usage will be limited mostly to the hash table and the line of input you are currently checking.
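The hash-table approach could be sketched in Python roughly as follows. This is a minimal sketch, not the answerer's own code: it assumes the patterns file lists one line per row and that a line is deleted only when it matches an entry exactly (whole-line, literal comparison rather than grep's regular expressions). The names load_patterns, filter_lines, and patterns.txt are hypothetical, and a set is used in place of a dictionary because only membership matters.

```python
import sys


def load_patterns(path):
    """Load the lines to delete into a set (hypothetical file: one entry per row)."""
    with open(path, "r", encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f}


def filter_lines(src, dst, excluded):
    """Stream src to dst line by line, skipping lines whose text is in excluded.

    Only the set and the current line are held in memory.
    Returns the number of lines written.
    """
    kept = 0
    for line in src:
        # Assumption: exact whole-line match, ignoring the trailing newline.
        if line.rstrip("\n") not in excluded:
            dst.write(line)
            kept += 1
    return kept


if __name__ == "__main__" and len(sys.argv) == 2:
    # Intended use: python filter_lines.py patterns.txt < input > output
    filter_lines(sys.stdin, sys.stdout, load_patterns(sys.argv[1]))
```

Iterating over the file object reads one line at a time, so the input file can be far larger than main memory. The output still has to go to a new file, which can be renamed over the original once the run has finished.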
