C# Excel Interop Slow when looping through cells

Problem description

I am trying to extract all text data from an Excel document in C# and am having performance issues. In the following code I open the Workbook, loop over all worksheets, and loop over all cells in the used range, extracting the text from each cell as I go. The problem is, this takes 14 seconds to execute.

using System.Collections.Generic;
using Excel = Microsoft.Office.Interop.Excel;

public class ExcelFile
{
    public string Path = @"C:\test.xlsx";
    private Excel.Application xl = new Excel.Application();
    private Excel.Workbook WB;
    public string FullText;
    private Excel.Range rng;
    private Dictionary<string, string> Variables;
    public ExcelFile()
    {
        WB = xl.Workbooks.Open(Path);
        xl.Visible = true;
        foreach (Excel.Worksheet CurrentWS in WB.Worksheets)
        {
            rng = CurrentWS.UsedRange;
            // Range indexing is 1-based, so the last cell sits at index rng.Count.
            // Every rng.Cells[i].Value access is a separate call into Excel.
            for (int i = 1; i <= rng.Count; i++)
            { FullText += rng.Cells[i].Value; }
        }
        WB.Close(false);
        xl.Quit();
    }
}

Whereas in VBA I would do something like this, which takes ~1 second:

Sub run()
    Dim strText As String
    For Each ws In ActiveWorkbook.Sheets
        For Each c In ws.UsedRange
            strText = strText & c.Text
        Next c
    Next ws
End Sub

Or, even faster (less than 1 second):

Sub RunFast()
    Dim strText As String
    Dim varCells As Variant
    For Each ws In ActiveWorkbook.Sheets
        varCells = ws.UsedRange
        For i = 1 To UBound(varCells, 1)
            For j = 1 To UBound(varCells, 2)
                strText = strText & CStr(varCells(i, j))
            Next j
        Next i
    Next ws
End Sub

Perhaps something is happening in the for loop in C# that I'm not aware of? Is it possible to load a range into an array-type object (as in my last example) to allow iteration over just the values, not the cell objects?

Solution

Excel and C# run in completely different environments. C# runs on the .NET Framework using managed memory, while Excel is a native C++ application that runs in unmanaged memory. Translating data between the two (a process called "marshaling") is extremely expensive in terms of performance.

Tweaking your code isn't going to help. For loops, string construction, etc. are all blazingly fast compared to the marshaling process. The only way you are going to get significantly better performance is to reduce the number of trips that have to cross the interprocess boundary. Extracting data cell by cell is never going to get you the performance you want.
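As a rough illustration of cutting the number of round trips, the sketch below pulls each sheet's used range across the boundary in a single call, much like the `RunFast` macro in the question, and then loops over a plain managed array. It assumes a reference to `Microsoft.Office.Interop.Excel`; the names `BulkReader`, `Flatten`, and `ReadAllText` are invented for this example, and no timings have been measured for it.

```csharp
using System;
using System.Text;
using Excel = Microsoft.Office.Interop.Excel;

public static class BulkReader
{
    // Concatenates every non-empty value in a block returned by Range.Value2.
    // Excel hands back a 1-based object[,], so rely on the array's own bounds.
    public static string Flatten(object[,] values)
    {
        var sb = new StringBuilder();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
                if (values[r, c] != null)
                    sb.Append(Convert.ToString(values[r, c]));
        return sb.ToString();
    }

    // Hypothetical entry point: opens the workbook and reads every worksheet.
    public static string ReadAllText(string path)
    {
        var xl = new Excel.Application();
        var sb = new StringBuilder();
        Excel.Workbook wb = null;
        try
        {
            wb = xl.Workbooks.Open(path);
            foreach (Excel.Worksheet ws in wb.Worksheets)
            {
                // One call into Excel per sheet instead of one per cell.
                object raw = ws.UsedRange.Value2;
                if (raw is object[,] block)
                    sb.Append(Flatten(block));
                else if (raw != null)   // a single-cell range yields a scalar
                    sb.Append(Convert.ToString(raw));
            }
        }
        finally
        {
            wb?.Close(false);
            xl.Quit();
        }
        return sb.ToString();
    }
}
```

Note that `Value2` returns the underlying values (for example, dates arrive as numbers), not the formatted text that `c.Text` gives in the VBA version, so the output may differ where cell formatting matters.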

Here are a few options:

  1. Write a sub or function in VBA that does everything you want, then call that sub or function via interop.

  2. Use interop to save the worksheet to a temporary file in CSV format, then open the file using C#. You will need to loop through and parse the file to get it into a useful data structure, but this loop will go much faster.

  3. Use interop to save a range of cells to the clipboard, then use C# to read the clipboard directly.
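Option 2 could be sketched roughly as follows. This is only one possible shape, under several assumptions: a reference to `Microsoft.Office.Interop.Excel`, a sheet that Excel allows to be copied, and content that survives the `xlCSV` format (which is written in the system code page, so non-ASCII text may need an explicit encoding when read back). The names `CsvRoute`, `ReadSheet`, and `Parse` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Excel = Microsoft.Office.Interop.Excel;

public static class CsvRoute
{
    // Hypothetical helper: copies one sheet into a scratch workbook, saves
    // that as CSV, closes it, then parses the file in managed code.
    public static List<string[]> ReadSheet(Excel.Application xl, Excel.Worksheet ws)
    {
        string tmp = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".csv");
        xl.DisplayAlerts = false;          // suppress format-compatibility prompts
        ws.Copy();                         // no arguments: copies into a new workbook
        Excel.Workbook scratch = xl.ActiveWorkbook;
        try { scratch.SaveAs(tmp, Excel.XlFileFormat.xlCSV); }
        finally { scratch.Close(false); }  // release the file before reading it
        try { return Parse(File.ReadAllText(tmp)); }
        finally { File.Delete(tmp); }
    }

    // Minimal CSV parser: handles quoted fields, doubled quotes, and line
    // breaks inside quotes. Runs entirely in managed memory.
    public static List<string[]> Parse(string text)
    {
        var rows = new List<string[]>();
        var row = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < text.Length; i++)
        {
            char ch = text[i];
            if (inQuotes)
            {
                if (ch == '"' && i + 1 < text.Length && text[i + 1] == '"') { field.Append('"'); i++; }
                else if (ch == '"') inQuotes = false;
                else field.Append(ch);
            }
            else if (ch == '"') inQuotes = true;
            else if (ch == ',') { row.Add(field.ToString()); field.Clear(); }
            else if (ch == '\n')
            {
                row.Add(field.ToString()); field.Clear();
                rows.Add(row.ToArray()); row.Clear();
            }
            else if (ch != '\r') field.Append(ch);
        }
        if (field.Length > 0 || row.Count > 0)   // last line without a trailing newline
        {
            row.Add(field.ToString());
            rows.Add(row.ToArray());
        }
        return rows;
    }
}
```

Here only the copy, save, and close calls cross into Excel; the parsing loop never touches a COM object, which is where the speedup in this option would come from.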
