PHPExcel用尽了256、512和1024MB的RAM [英] PHPExcel runs out of 256, 512 and also 1024MB of RAM

查看:101
本文介绍了PHPExcel用尽了256、512和1024MB的RAM的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我不明白. XSLX表的大小约为3MB,但即使1024MB的RAM也不足以使PHPExcel将该表加载到内存中?

I don't understand it. The XSLX table is about 3MB large yet even 1024MB of RAM is not enough for PHPExcel to load it into memory?

我在这里可能做错了什么

I might be doing something horribly wrong here:

function ReadXlsxTableIntoArray($theFilePath)
{
    require_once('PHPExcel/Classes/PHPExcel.php');
    $inputFileType = 'Excel2007';
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($theFilePath);
    $rowIterator = $objPHPExcel->getActiveSheet()->getRowIterator();
    $arrayData = $arrayOriginalColumnNames = $arrayColumnNames = array();
    foreach($rowIterator as $row){
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false); // Loop all cells, even if it is not set
        if(1 == $row->getRowIndex ()) {
            foreach ($cellIterator as $cell) {
                $value = $cell->getCalculatedValue();
                $arrayOriginalColumnNames[] = $value;
                // let's remove the diacritique
                $value = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value);
                // and white spaces
                $valueExploded = explode(' ', $value);
                $value = '';
                // capitalize the first letter of each word
                foreach ($valueExploded as $word) {
                    $value .= ucfirst($word);
                }
                $arrayColumnNames[] = $value;
            }
            continue;
        } else {
            $rowIndex = $row->getRowIndex();
            reset($arrayColumnNames);
            foreach ($cellIterator as $cell) {
                $arrayData[$rowIndex][current($arrayColumnNames)] = $cell->getCalculatedValue();
                next($arrayColumnNames);
            }
        }
    }
    return array($arrayOriginalColumnNames, $arrayColumnNames, $arrayData);
}

上面的函数将数据从excel表读取到数组.

The function above reads data from an excel table to an array.

有什么建议吗?

首先,我允许PHP使用256MB的RAM.这还不够.然后,我将容量增加了一倍,然后又尝试了1024MB.它仍然会因以下错误而用完内存:

At first, I allowed PHP to use 256MB of RAM. It was not enough. I then doubled the amount and then also tried 1024MB. It still runs out of memory with this error:

Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

Fatal error (shutdown): Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

推荐答案

在PHPExcel论坛上已经有很多关于PHPExcel内存使用的文章.因此通读之前的一些讨论可能会给您一些想法. PHPExcel拥有电子表格的内存中"表示形式,并且容易受到PHP内存限制的影响.

There's plenty been written about the memory usage of PHPExcel on the PHPExcel forum; so reading through some of those previous discussions might give you a few ideas. PHPExcel holds an "in memory" representation of a spreadsheet, and is susceptible to PHP memory limitations.

文件的物理大小在很大程度上无关紧要...知道文件包含多少个单元格(每个工作表上的行*列)更为重要.

The physical size of the file is largely irrelevant... it's much more important to know how many cells (rows*columns on each worksheet) it contains.

我一直使用的经验法则"平均约为每单元1k,因此5M单元工作簿将需要5GB内存.但是,可以通过多种方法来减少该需求.可以将这些组合在一起,具体取决于您需要在工作簿中访问哪些信息以及您想使用该信息做什么.

The "rule of thumb" that I've always used is an average of about 1k/cell, so a 5M cell workbook is going to require 5GB of memory. However, there are a number of ways that you can reduce that requirement. These can be combined, depending on exactly what information you need to access within your workbook, and what you want to do with it.

如果您有多个工作表,但不需要全部加载,则可以使用setLoadSheetsOnly()方法限制Reader将加载的工作表. 加载单个命名的工作表:

If you have multiple worksheets, but don't need to load all of them, then you can limit the worksheets that the Reader will load using the setLoadSheetsOnly() method. To load a single named worksheet:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2'; 
/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Advise the Reader of which WorkSheets we want to load  **/ 
$objReader->setLoadSheetsOnly($sheetname); 
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

或者您可以通过传递一组名称来一次调用setLoadSheetsOnly()来指定多个工作表:

Or you can specify several worksheets with one call to setLoadSheetsOnly() by passing an array of names:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3'); 
/** Create a new Reader of the type defined in $inputFileType **/ 
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/ 
$objReader->setLoadSheetsOnly($sheetnames); 
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

如果您只需要访问工作表的一部分,则可以定义一个读取过滤器"来识别您实际要加载的单元格:

If you only need to access part of a worksheet, then you can define a Read Filter to identify just which cells you actually want to load:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3'; 

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */ 
class MyReadFilter implements PHPExcel_Reader_IReadFilter {
    public function readCell($column, $row, $worksheetName = '') {
        //  Read rows 1 to 7 and columns A to E only 
        if ($row >= 1 && $row <= 7) {
           if (in_array($column,range('A','E'))) { 
              return true;
           }
        } 
        return false;
    }
}

/**  Create an Instance of our Read Filter  **/ 
$filterSubset = new MyReadFilter(); 
/** Create a new Reader of the type defined in $inputFileType **/ 
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Advise the Reader of which WorkSheets we want to load 
     It's more efficient to limit sheet loading in this manner rather than coding it into a Read Filter  **/ 
$objReader->setLoadSheetsOnly($sheetname); 
echo 'Loading Sheet using filter';
/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/ 
$objReader->setReadFilter($filterSubset); 
/**  Load only the rows and columns that match our filter from $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

使用读取过滤器,您还可以块式"读取工作簿,以便在任何时候都只有一个块驻留在内存中:

Using read filters, you can also read a workbook in "chunks", so that only a single chunk is memory-resident at any one time:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example2.xls';

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */ 
class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */ 
    public function setRows($startRow, $chunkSize) { 
        $this->_startRow    = $startRow; 
        $this->_endRow      = $startRow + $chunkSize;
    } 

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow 
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) { 
           return true;
        }
        return false;
    } 
}

/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Define how many rows we want to read for each "chunk"  **/ 
$chunkSize = 20;
/**  Create a new Instance of our Read Filter  **/ 
$chunkFilter = new chunkReadFilter(); 
/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/ 
$objReader->setReadFilter($chunkFilter); 

/**  Loop to read our worksheet in "chunk size" blocks  **/ 
/**  $startRow is set to 2 initially because we always read the headings in row #1  **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) { 
    /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/ 
    $chunkFilter->setRows($startRow,$chunkSize); 
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/ 
    $objPHPExcel = $objReader->load($inputFileName); 
    //    Do some processing here 

    //    Free up some of the memory 
    $objPHPExcel->disconnectWorksheets(); 
    unset($objPHPExcel); 
}

如果您不需要加载格式信息,而仅加载工作表数据,则setReadDataOnly()方法将告诉读者仅加载单元格值,而忽略任何单元格格式:

If you don't need to load formatting information, but only the worksheet data, then the setReadDataOnly() method will tell the reader only to load cell values, ignoring any cell formatting:

$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
/** Create a new Reader of the type defined in $inputFileType **/ 
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cell data, not formatting **/ 
$objReader->setReadDataOnly(true);
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

使用单元缓存.这是一种减少每个单元所需的PHP内存的方法,但是会降低速度.它通过以压缩格式存储单元对象或在PHP内存(例如磁盘,APC,内存缓存)之外存储单元对象而工作...但是,保存的内存越多,脚本执行的速度就越慢.但是,您可以将每个单元所需的内存减少到大约300bytes,因此假设的5M单元将需要大约1.4GB的PHP内存.

Use cell caching. This is a method for reducing the PHP memory that is required for each cell, but at a cost in speed. It works by storing the cell objects in a compressed format, or outside of PHP's memory (eg. disk, APC, memcache)... but the more memory you save, the slower your scripts will execute. You can, however, reduce the memory required by each cell to about 300bytes, so the hypothetical 5M cells would require about 1.4GB of PHP memory.

单元格缓存在开发人员文档的4.2.1节中进行了描述

Cell caching is described in section 4.2.1 of the Developer Documentation

编辑

在查看代码时,您使用的并不是特别有效的迭代器,并建立了单元数据数组.您可能想要查看toArray()方法,该方法已经内置在PHPExcel中,并且可以为您完成此操作.另请参阅关于SO的最近的讨论有关要构建的新变体方法rangeToArray()的内容行数据的关联数组.

Looking at your code, you're using the iterators, which aren't particularly efficient, and building up an array of cell data. You might want to look at the toArray() method, which is already built into PHPExcel, and does this for you. Also take a look at this recent discussion on SO about the new variant method rangeToArray() to build an associative array of row data.

这篇关于PHPExcel用尽了256、512和1024MB的RAM的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆