Efficiently counting the number of lines of a text file (200MB+)


Question

I have just found out that my script gives me a fatal error:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 440 bytes) in C:\process_txt.php on line 109

The line in question is:

$lines = count(file($path)) - 1;

So I think it is having difficulty loading the whole file into memory just to count the lines. Is there a more efficient way to do this without running into memory issues?

The text files I need to count lines for range from 2MB to 500MB, and occasionally reach a gigabyte.

Thanks for any help.

Recommended answer

This will use less memory, since it doesn't load the whole file into memory:

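$file = "largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
if ($handle === false) {
  die("Unable to open $file");
}
// Test the return value of fgets() instead of calling feof() first:
// a feof() loop can count one line too many when the file ends with a newline.
while (fgets($handle) !== false) {
  $linecount++;
}

fclose($handle);

echo $linecount;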

fgets loads a single line into memory (if the second argument, $length, is omitted, it keeps reading from the stream until it reaches the end of the line, which is what we want). If you care about wall time as well as memory usage, this is still unlikely to be as quick as using something other than PHP.
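As a minimal sketch of the line-at-a-time approach described above, the loop can be wrapped in a function. The name countLinesWithFgets is hypothetical and not part of the original answer; the sketch assumes the file is readable and relies on fgets() returning false at end of file.

```php
<?php
// Hypothetical helper (illustrative name, not from the original answer).
function countLinesWithFgets(string $path): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $count = 0;
    // Only one line is held in memory at a time.
    // fgets() returns false once there is nothing left to read.
    while (fgets($handle) !== false) {
        $count++;
    }

    fclose($handle);
    return $count;
}
```

With this loop condition, a final line that lacks a trailing newline is still counted, and an empty file yields 0.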

The only danger with this is if any lines are particularly long (what if you encounter a 2GB file without line breaks?). In that case you are better off reading the file in chunks and counting end-of-line characters:

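$file = "largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
// Each call reads at most 4095 bytes, so a very long line cannot exhaust memory.
while (($chunk = fgets($handle, 4096)) !== false) {
  // Count "\n" rather than PHP_EOL: PHP_EOL is "\r\n" on Windows and would
  // miss the line breaks in files that use Unix line endings.
  // Note: a final line without a trailing newline is not counted.
  $linecount += substr_count($chunk, "\n");
}

fclose($handle);

echo $linecount;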
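The chunked idea can also be sketched with fread(), which reads fixed-size blocks regardless of where line breaks fall. This is one possible variant, not the original answer's code: the function name countLinesChunked and the 1 MiB default chunk size are assumptions, and it treats "\n" as the line terminator (so it handles "\n" and "\r\n" files, but not files that use a bare "\r").

```php
<?php
// Hypothetical helper (illustrative name and default chunk size).
function countLinesChunked(string $path, int $chunkSize = 1048576): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $count = 0;
    $lastByte = "\n"; // An empty file has no unterminated final line.
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        // Memory use is bounded by $chunkSize, however long the lines are.
        $count += substr_count($chunk, "\n");
        $lastByte = $chunk[strlen($chunk) - 1];
    }
    fclose($handle);

    // Count a final line that does not end with a newline.
    if ($lastByte !== "\n") {
        $count++;
    }
    return $count;
}
```

Counting the single byte "\n" means a "\r\n" pair split across two chunks is still counted exactly once.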
