Process very big CSV file without timeout and memory error


Problem Description

At the moment I'm writing an import script for a very big CSV file. The problem is that most of the time it stops after a while, either because of a timeout or because it throws a memory error.

My idea was to parse the CSV file in steps of 100 lines and, after every 100 lines, re-invoke the script automatically. I tried to achieve this with header('Location: ...'), passing the current line as a GET parameter, but it didn't work out as I wanted.
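A minimal sketch of that chunked approach might look like the following; the file name big_file.csv, the script name import.php, and the process_line() helper are hypothetical placeholders for illustration, not something from the original post:

<?php
// sketch of the "process 100 lines per request" idea
$offset    = isset($_GET['offset']) ? (int)$_GET['offset'] : 0;
$chunkSize = 100;

if (($handle = fopen('big_file.csv', 'r')) !== false)
{
    // skip the lines that previous requests already handled
    for ($i = 0; $i < $offset && fgetcsv($handle) !== false; $i++);

    // process the next chunk of up to $chunkSize lines
    $processed = 0;
    while ($processed < $chunkSize && ($data = fgetcsv($handle)) !== false)
    {
        process_line($data); // hypothetical per-line import logic
        $processed++;
    }
    fclose($handle);

    // if a full chunk was read, there may be more lines left,
    // so re-invoke this script with the new offset
    // (header() only works if nothing has been output yet)
    if ($processed === $chunkSize)
    {
        header('Location: import.php?offset=' . ($offset + $chunkSize));
        exit;
    }
}

One drawback of this scheme is that every request has to skip over all previously processed lines, so the total work grows quadratically with the file size; the streaming answer below avoids that entirely.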

Is there a better way to do this, or does someone have an idea of how to get rid of the memory error and the timeout?

Recommended Answer

I've used fgetcsv to read a 120 MB CSV in a streaming manner. It reads the file line by line, and I then inserted each line into a database. That way, only one line is held in memory on each iteration. The script still needed 20 minutes to run, so maybe I'll try Python next time. Don't try to load a huge CSV file into an array; that really would consume a lot of memory.

// WDI_GDF_Data.csv (120.4MB) are the World Bank collection of development indicators:
// http://data.worldbank.org/data-catalog/world-development-indicators
if(($handle = fopen('WDI_GDF_Data.csv', 'r')) !== false)
{
    // get the first row, which contains the column-titles (if necessary)
    $header = fgetcsv($handle);

    // loop through the file line-by-line
    while(($data = fgetcsv($handle)) !== false)
    {
        // resort/rewrite data and insert into DB here
        // try to use conditions sparingly here, as they will slow the loop down

        // I don't know if this is really necessary, but it can't hurt;
        // see also: http://php.net/manual/en/features.gc.php
        unset($data);
    }
    fclose($handle);
}
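
The streaming loop above fixes the memory error; it does not by itself address the timeout. One common option, which is an assumption of mine rather than part of the original answer, is to lift PHP's execution time limit:

// allow the script to run longer than PHP's default execution time limit
// (0 means no limit; see http://php.net/manual/en/function.set-time-limit.php)
set_time_limit(0);

Running the import via the php CLI instead of a web request achieves the same thing, since max_execution_time defaults to unlimited on the command line.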
