Optimizing performance for ClosedXML loops and row deletion
Question
I'm reading an Excel file and looping through its rows, deleting those that meet a condition:
using (var wb = new XLWorkbook(path))
{
    var ws = wb.Worksheet(sheet);
    int deleted = 0;
    for (int row_i = 2; row_i <= ws.LastRowUsed().RowNumber(); row_i++)
    {
        ExcelRow row = new ExcelRow(ws.Row(row_i-deleted));
        row.styleCol = header.styleCol;
        K key = keyReader(row);
        if (!writeData(row,dict[key])) deleted++;
    }
    wb.Save();
}
The code is very slow for a file with thousands of rows, both when nothing is deleted and when hundreds of rows must be deleted.
Recommended answer
There are two important optimizations to make. The first is trivial but has a great impact: store the last row number in a variable, because the call that computes it is more expensive than you might expect, and the original loop condition evaluates it again on every iteration.
int lastrow = ws.LastRowUsed().RowNumber();
for (int row_i = 2; row_i <= lastrow; row_i++)
The second is a bit more involved. It concerns the multiple (and slow) row/cell shifts (XLShiftDeletedCells.ShiftCellsUp) that happen when the deleted rows do not form a single range. For that case I can suggest a workaround: do not delete single rows during writeData. Note that you then no longer subtract deleted from the loop index:
ExcelRow row = new ExcelRow(ws.Row(row_i)); // no deletion in the loop
Instead, temporarily add a column (temp_col) that marks each row as "ok" or "skip", and at the end sort on it, so that all the rows to remove form a single range and can be deleted in one operation.
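if (deleted > 0)
{
    int lastcol = ws.LastColumnUsed().ColumnNumber();
    // Sort the data rows so that the "skip" rows end up after the "ok" rows
    var tab = ws.Range(ws.Cell(2, 1), ws.Cell(lastrow, lastcol));
    tab.Sort(temp_col);
    // The last `deleted` rows are now the ones to remove: delete them as one range
    tab = ws.Range(ws.Cell(lastrow - deleted + 1, 1), ws.Cell(lastrow, lastcol));
    tab.Delete(XLShiftDeletedCells.ShiftCellsUp);
}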
ws.Column(temp_col).Delete();
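Putting the pieces together, the workaround could look roughly like the sketch below. It is only an illustration under some assumptions: RowPruner, DeleteMarkedRows and the shouldSkip callback are hypothetical names standing in for the keyReader/writeData logic of the question, the marker column is placed just past the last used column, and the sheet has a header in row 1. The order of the kept rows after sorting is not checked here.

```csharp
using System;
using ClosedXML.Excel;

static class RowPruner
{
    // Removes every data row (row 2 onward) for which shouldSkip returns true
    // and returns how many rows were removed.
    // shouldSkip is a hypothetical stand-in for the question's writeData logic.
    public static int DeleteMarkedRows(IXLWorksheet ws, Func<IXLRow, bool> shouldSkip)
    {
        var lastUsedRow = ws.LastRowUsed();
        if (lastUsedRow == null) return 0;                 // nothing on the sheet

        int lastrow = lastUsedRow.RowNumber();             // computed once, not per iteration
        int lastcol = ws.LastColumnUsed().ColumnNumber();
        int temp_col = lastcol + 1;                        // assumption: scratch column after the data
        int deleted = 0;

        // Pass 1: only mark rows, never delete inside the loop.
        for (int row_i = 2; row_i <= lastrow; row_i++)
        {
            bool skip = shouldSkip(ws.Row(row_i));
            ws.Cell(row_i, temp_col).SetValue(skip ? "skip" : "ok");
            if (skip) deleted++;
        }

        if (deleted > 0)
        {
            // Ascending sort on the marker column puts "ok" before "skip".
            // The range starts at column 1, so temp_col is also the index inside the range.
            ws.Range(2, 1, lastrow, temp_col).Sort(temp_col);

            // The last `deleted` rows are the marked ones: remove them with a single shift.
            ws.Range(lastrow - deleted + 1, 1, lastrow, temp_col)
              .Delete(XLShiftDeletedCells.ShiftCellsUp);
        }

        ws.Column(temp_col).Delete();                      // drop the scratch column
        return deleted;
    }
}
```

A call might look like RowPruner.DeleteMarkedRows(ws, r => r.Cell(1).IsEmpty()), followed by wb.Save(). The point of the design is that the sheet is shifted once, for one contiguous range, instead of once per deleted row.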
Performance test
There is nothing to add about the first point. The second is original to this answer, and I can confirm, by measuring the elapsed time with a Stopwatch, that it reduced the execution time by more than 80% in my situation (from 200 to 27 seconds).
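For anyone who wants to repeat the comparison on their own file, a timing helper could be sketched like this. Timing and Measure are hypothetical names. Only System.Diagnostics.Stopwatch comes from the answer itself.

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Runs the given action once and returns the wall-clock time it took.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Wrapping each variant of the processing code in Timing.Measure(() => ...) and printing the two results gives a before/after figure comparable to the one quoted above. The numbers will of course depend on the file and the machine.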