Break a Large File Into Many Smaller Files With PHP
Question
I have a 209MB .txt file with about 95,000 lines that is automatically pushed to my server once a week to update some content on my website. The problem is I cannot allocate enough memory to process such a large file, so I want to break the large file into smaller files with 5,000 lines each.
I cannot use file() at all until the file is broken into smaller pieces, so I have been working with SplFileObject. But I have gotten nowhere with it. Here's some pseudocode of what I want to accomplish:
read the file contents
while there are still lines left to be read in the file
    create a new file
    write the next 5000 lines to this file
    close this file
for each file created
    run mysql update queries with the new content
delete all of the files that were created
The file is in CSV format.
Here is the solution I ended up with for reading a line by its number, based on the answers below:
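function getLine($number) {
    global $handle, $index;
    $offset = $index[$number];
    fseek($handle, $offset);
    // Fields in content.txt are pipe-delimited; strip the trailing newline first.
    return explode("|", rtrim(fgets($handle), "\r\n"));
}

$handle = fopen("content.txt", "r");
if ($handle === false) {
    die("Could not open content.txt");
}
// Line 0 starts at byte 0; each later entry is the offset where the next line begins.
$index = array(0);
while (false !== ($line = fgets($handle))) {
    $index[] = ftell($handle);
}
print_r(getLine(18437));
fclose($handle);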
Recommended answer
If your big file is in CSV format, you most likely need to process it line by line and do not actually need to break it into smaller files. There is no need to hold 5,000 or more lines in memory at once. To do that, simply use PHP's "low-level" file functions:
$fp = fopen("path/to/file", "r");
while (false !== ($line = fgets($fp))) {
    // Process $line, e.g. split it into values since it is CSV.
    // fgets() keeps the line ending, so strip it before splitting.
    $values = explode(",", rtrim($line, "\r\n"));
    // Do stuff: Run MySQL updates, ...
}
fclose($fp);
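If the data contains quoted fields or uses a delimiter other than a comma (the question's own code splits on "|"), PHP's built-in fgetcsv() can take over both the reading and the splitting. The following is only a minimal sketch of that variant: process_rows() and its parameters are illustrative names, not part of the original answer, and the callback is where the MySQL updates would go.

```php
<?php
// Sketch: stream a delimited file one record at a time with fgetcsv().
// process_rows(), $onRow and $delimiter are hypothetical names for illustration.
function process_rows(string $path, callable $onRow, string $delimiter = ","): int
{
    $fp = fopen($path, "r");
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $count = 0;
    try {
        // Length 0 means "no line length limit"; only one record is held in memory.
        while (($row = fgetcsv($fp, 0, $delimiter, '"', "\\")) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv() reports a blank line as an array with one null
            }
            $onRow($row, $count); // e.g. run the update query for this record
            $count++;
        }
    } finally {
        fclose($fp);
    }

    return $count; // number of records handed to the callback
}
```

A call such as process_rows("content.txt", $callback, "|") would then visit each record of a pipe-delimited file in order, without ever loading the whole file.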
If you need random access, e.g. reading a line by its line number, you could create a "line index" for your file:
$fp = fopen("path/to/file", "r");
$index = array(0);
while (false !== ($line = fgets($fp))) {
    $index[] = ftell($fp); // get the current byte offset
}
Now $index maps line numbers to byte offsets, and you can navigate to a line by using fseek():
function get_line($number)
{
    global $fp, $index;
    $offset = $index[$number];
    fseek($fp, $offset);
    return fgets($fp);
}

$line10 = get_line(10);
// ... Once you are done:
fclose($fp);
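The same index can also be kept inside an object instead of global variables. This is just one possible arrangement, assuming PHP 7.4 or later for the typed property; the class name LineIndex and its methods are made up for this sketch. It uses the same 0-based line numbers as the code above and additionally drops the final index entry, which points at the end of the file rather than at a line.

```php
<?php
// Sketch: the line index wrapped in a class so no globals are needed.
// LineIndex, count() and getLine() are hypothetical names for illustration.
final class LineIndex
{
    /** @var resource|false */
    private $fp;
    private array $offsets = [0]; // line 0 starts at byte 0

    public function __construct(string $path)
    {
        $this->fp = fopen($path, "r");
        if ($this->fp === false) {
            throw new RuntimeException("Cannot open $path");
        }
        // After each line is read, ftell() is the offset where the next line starts.
        while (fgets($this->fp) !== false) {
            $this->offsets[] = ftell($this->fp);
        }
        array_pop($this->offsets); // the last entry is the EOF offset, not a line start
    }

    public function count(): int
    {
        return count($this->offsets);
    }

    // Returns the line without its line ending, or null if the number is out of range.
    public function getLine(int $number): ?string
    {
        if (!isset($this->offsets[$number])) {
            return null;
        }
        fseek($this->fp, $this->offsets[$number]);
        $line = fgets($this->fp);
        return $line === false ? null : rtrim($line, "\r\n");
    }

    public function __destruct()
    {
        if (is_resource($this->fp)) {
            fclose($this->fp);
        }
    }
}
```

For a file of roughly 95,000 lines the index holds one integer per line, so it stays small even though the file itself is large.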
Note that I started line counting at 0, unlike text editors.
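If separate files of 5,000 lines each are still wanted, as in the question's pseudocode, the same fgets() loop can write them without loading the source into memory. This is a rough sketch rather than a tested production routine: split_file() and the "<prefix>NNN.txt" naming scheme are assumptions made for the example.

```php
<?php
// Sketch: split a text file into chunk files of at most $linesPerChunk lines.
// split_file() and the chunk naming scheme are hypothetical, for illustration only.
function split_file(string $source, string $prefix, int $linesPerChunk = 5000): array
{
    if ($linesPerChunk < 1) {
        throw new InvalidArgumentException("linesPerChunk must be at least 1");
    }
    $in = fopen($source, "r");
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }

    $chunks = [];  // paths of the chunk files created so far
    $out = null;   // handle of the chunk currently being written
    $lineNo = 0;

    while (($line = fgets($in)) !== false) {
        // Start a new chunk file every $linesPerChunk lines.
        if ($lineNo % $linesPerChunk === 0) {
            if ($out !== null) {
                fclose($out);
            }
            $name = sprintf("%s%03d.txt", $prefix, count($chunks));
            $out = fopen($name, "w");
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot create $name");
            }
            $chunks[] = $name;
        }
        fwrite($out, $line); // the line ending is preserved by fgets()
        $lineNo++;
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $chunks;
}
```

Each returned path can then be processed and removed with unlink() once its updates have run. Since every line already passes through the loop one at a time, processing it directly there, as described above, avoids the temporary files altogether.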