Break A Large File Into Many Smaller Files With PHP


Question

I have a 209MB .txt file with about 95,000 lines that is automatically pushed to my server once a week to update some content on my website. The problem is that I cannot allocate enough memory to process such a large file, so I want to break it into smaller files of 5,000 lines each.

I cannot use file() at all until the file is broken into smaller pieces, so I have been working with SplFileObject, but I have gotten nowhere with it. Here is some pseudocode of what I want to accomplish:

read the file contents

while there are still lines left to be read in the file
    create a new file
    write the next 5000 lines to this file
    close this file

for each file created
    run mysql update queries with the new content

delete all of the files that were created
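
For reference, the splitting step of the pseudocode above could be sketched roughly as follows. This is a minimal sketch that reads with fgets() rather than SplFileObject; the function name splitFileByLines and the chunk naming scheme are hypothetical, not something from the question.

```php
<?php
// Hypothetical helper: copies $sourcePath into files of at most $linesPerChunk lines each.
// The chunk naming scheme ("<prefix>0.txt", "<prefix>1.txt", ...) is an assumption.
function splitFileByLines(string $sourcePath, string $chunkPrefix, int $linesPerChunk = 5000): array
{
    $in = fopen($sourcePath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $sourcePath");
    }

    $chunkPaths = [];
    $out = null;
    $linesInChunk = 0;

    // fgets() returns one line per call, so only a single line is held in memory.
    while (($line = fgets($in)) !== false) {
        if ($out === null) {
            // Start a new chunk file lazily, so no empty trailing chunk is created.
            $path = $chunkPrefix . count($chunkPaths) . '.txt';
            $out = fopen($path, 'w');
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot create $path");
            }
            $chunkPaths[] = $path;
        }

        fwrite($out, $line);

        if (++$linesInChunk >= $linesPerChunk) {
            fclose($out);
            $out = null;
            $linesInChunk = 0;
        }
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $chunkPaths;
}
```

The returned array lists the chunk files in order, so the remaining steps of the pseudocode can loop over it to run the updates and then call unlink() on each path.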

The file is in CSV format.

Here is the solution for reading the file by line number, based on the answers below:

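function getLine($number) {
    global $handle, $index;
    $offset = $index[$number];
    fseek($handle, $offset);
    // Strip the trailing line break so it does not end up in the last field.
    return explode("|", rtrim(fgets($handle), "\r\n"));
}

$handle = fopen("content.txt", "r");
if ($handle === false) {
    die("Could not open content.txt");
}

// Line 0 starts at byte offset 0; each later entry is where the next line begins.
$index = array(0);
while (false !== ($line = fgets($handle))) {
    $index[] = ftell($handle);
}

print_r(getLine(18437));

fclose($handle);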

Recommended answer

If your big file is in CSV format, I guess that you need to process it line by line and don't actually need to break it into smaller files. There is no need to hold 5,000 or more lines in memory at once. To do that, simply use PHP's "low-level" file functions:

$fp = fopen("path/to/file", "r");

while (false !== ($line = fgets($fp))) {
    // Process $line, e.g. split it into values since it is CSV.
    $values = explode(",", $line);

    // Do stuff: Run MySQL updates, ...
}

fclose($fp);
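
Note that explode() splits on every delimiter, including ones inside quoted fields. If the data may contain quoted values, PHP's built-in fgetcsv() handles them while still reading one line at a time. A minimal sketch follows; the generator name readCsvRows is hypothetical, and the delimiter is a parameter because the question's own code splits on "|" rather than ",".

```php
<?php
// Hypothetical helper: yields one parsed CSV row at a time, so memory use
// stays at roughly one row regardless of file size.
function readCsvRows(string $path, string $delimiter = ','): Generator
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        // Length 0 means "no line length limit"; enclosure and escape are passed explicitly.
        while (($row = fgetcsv($fp, 0, $delimiter, '"', '\\')) !== false) {
            // fgetcsv() returns an array containing a single null for a blank line.
            if ($row === [null]) {
                continue;
            }
            yield $row;
        }
    } finally {
        // Runs when iteration finishes or the generator is destroyed early.
        fclose($fp);
    }
}
```

A caller would then write something like foreach (readCsvRows("path/to/file") as $values) { ... } and run the MySQL updates inside the loop.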

If you need random access, e.g. reading a line by its line number, you could create a "line index" for your file:

$fp = fopen("path/to/file", "r");

$index = array(0);

while (false !== ($line = fgets($fp))) {
    $index[] = ftell($fp);  // get the current byte offset
}

Now $index maps line numbers to byte offsets, and you can jump to a line by using fseek():

function get_line($number)
{
    global $fp, $index;
    $offset = $index[$number];
    fseek($fp, $offset);
    return fgets($fp);
}

$line10 = get_line(10);

// ... Once you are done:
fclose($fp);

Note that I started line counting at 0, unlike text editors.
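
The same line-index idea can also be wrapped in a small class so that no global variables are needed. This is only a sketch under that assumption; the class name LineIndexedFile and its methods are hypothetical. The index stores one integer per line, which is far smaller than the file itself.

```php
<?php
// Hypothetical class: keeps the file handle and the line index together.
final class LineIndexedFile
{
    private $fp;
    private $index = [0]; // line 0 starts at byte offset 0

    public function __construct(string $path)
    {
        $this->fp = fopen($path, 'r');
        if ($this->fp === false) {
            throw new RuntimeException("Cannot open $path");
        }
        // Record the byte offset at which each following line begins.
        while (fgets($this->fp) !== false) {
            $this->index[] = ftell($this->fp);
        }
        // The last recorded offset is end-of-file, not the start of a line.
        array_pop($this->index);
    }

    public function lineCount(): int
    {
        return count($this->index);
    }

    // Returns the line (0-based) without its line break, or null if out of range.
    public function getLine(int $number): ?string
    {
        if (!isset($this->index[$number])) {
            return null;
        }
        fseek($this->fp, $this->index[$number]);
        $line = fgets($this->fp);
        return $line === false ? null : rtrim($line, "\r\n");
    }

    public function __destruct()
    {
        if (is_resource($this->fp)) {
            fclose($this->fp);
        }
    }
}
```

With this, $file = new LineIndexedFile("path/to/file"); followed by $file->getLine(10) plays the role of get_line(10), and the handle is closed when the object goes out of scope.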
