Process very big CSV file without timeout and memory error

Problem Description

At the moment I'm writing an import script for a very big CSV file. The problem is that most of the time it stops after a while, either because of a timeout or because it throws a memory error.

My idea was to parse the CSV file in steps of 100 lines and, after each 100 lines, re-invoke the script automatically. I tried to achieve this with header('Location: ...'), passing the current line as a GET parameter, but it didn't work out as I wanted.
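For reference, a minimal sketch of that chunked approach (the script name import.php, the file big.csv, and the offset GET parameter are assumptions for illustration, not names from the question):

// Hypothetical sketch of the chunked approach: import.php, big.csv and the
// 'offset' GET parameter are assumptions, not names from the question.
$chunkSize = 100;
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;

if (($handle = fopen('big.csv', 'r')) !== false)
{
    // skip the lines already handled by previous requests
    for ($i = 0; $i < $offset && fgetcsv($handle) !== false; $i++);

    $processed = 0;
    while ($processed < $chunkSize && ($data = fgetcsv($handle)) !== false)
    {
        // ... import $data here ...
        $processed++;
    }
    fclose($handle);

    if ($processed === $chunkSize)
    {
        // more lines may remain: re-invoke this script with the new offset
        header('Location: import.php?offset=' . ($offset + $processed));
        exit;
    }
}

Note that skipping to the offset still reads all the previous lines on every request, so each chunk gets slower as the offset grows.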

Is there a better way to do this, or does someone have an idea how to get rid of the memory error and the timeout?

Recommended Answer

I've used fgetcsv to read a 120 MB CSV in a stream-wise manner. It reads the file line by line, and I then inserted each line into a database. That way, only one line is held in memory on each iteration. The script still took 20 minutes to run; maybe I'll try Python next time. Don't try to load a huge CSV file into an array, as that really would consume a lot of memory.

// WDI_GDF_Data.csv (120.4MB) are the World Bank collection of development indicators:
// http://data.worldbank.org/data-catalog/world-development-indicators
if(($handle = fopen('WDI_GDF_Data.csv', 'r')) !== false)
{
    // get the first row, which contains the column-titles (if necessary)
    $header = fgetcsv($handle);

    // loop through the file line-by-line
    while(($data = fgetcsv($handle)) !== false)
    {
        // resort/rewrite data and insert into DB here
        // try to use conditions sparingly here, as they can slow things down

        // I don't know if this is really necessary, but it couldn't harm;
        // see also: http://php.net/manual/en/features.gc.php
        unset($data);
    }
    fclose($handle);
}
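For the "insert into DB" step the answer mentions, here is a minimal sketch using PDO with a prepared statement (the DSN, credentials, the indicators table, and the column mapping are assumptions, not part of the original answer):

// Sketch only: connection details and the indicators table are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=worldbank', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// prepare the statement once, outside the loop
$stmt = $pdo->prepare('INSERT INTO indicators (country, indicator, year, value) VALUES (?, ?, ?, ?)');

if (($handle = fopen('WDI_GDF_Data.csv', 'r')) !== false)
{
    $header = fgetcsv($handle); // skip the title row

    // one transaction around many inserts avoids a commit per row
    $pdo->beginTransaction();
    while (($data = fgetcsv($handle)) !== false)
    {
        // hypothetical mapping from CSV columns to table columns
        $stmt->execute(array($data[0], $data[1], $data[2], $data[3]));
    }
    $pdo->commit();
    fclose($handle);
}

Preparing the statement once and committing a single transaction keeps per-row overhead low, which helps with the long runtime as well as the timeout.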
