映射平面文本文件 [英] Mapping a flat text file

查看:112
本文介绍了映射平面文本文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在文本文件中,每行末尾的\\\
检测到行。为此,需要读取整个文件,这对于大文件(例如2GB)来说是一个大问题。我正在寻找一个方法来读取单行而不走遍整个文件(虽然我知道这应该是一个复杂的过程)。

In a text file, lines are detected by \n at the end of each line. For this purpose, it is necessary to read the entire file, and this is a big problem for large files (say 2GB). I am looking for a method to read a single line without walking through the entire file (though I know it should be a complicated process).


  1. 我知道的第一种方式是使用fseek()与偏移;但不实用。

  2. 创建键/值的平面文件;但我不知道是否有一种方法可以避免将整个内存加载到RAM中(应该像在php中读取数组一样)。

  3. 或者,我们可以数字在每行的开始读取。我的意思是,可以通过跳过行内容(到下一行)读取行开头的第一个数字。

  1. The first way I know is to use fseek() with offset; but it is not practical.
  2. Creating a flat file of key/value; but I am not sure if there is a way to avoid loading the entire into RAM (it should be something like reading an array in php).
  3. Alternatively, can we make some numbers at the beginning of each line to be read. I mean, is it possible to read the first digits at the beginning of the line by skipping the line contents (going to the next line).

768| line content is here
769| another line
770| something


如果只读取第一个数字,应该读取的总数据对于大文件来说并不多。

If reading only the first digits, the total data which should be read is not much even for large files.

推荐答案

您需要读取可在行号索引的特定行。如果是这样只是做一个二分搜索。读取(说)文件中间的200个字符以找出行号。然后在两半中重复,直到到达正确的线。

Do you need to read specific lines that can be indexed on line number?. If so just do a binary search. Read (say) 200 characters in the middle of the file to find out a line number. Then repeat in either of the halves until you get to the right line.

这篇关于映射平面文本文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆