What architecture to use to address this SystemOutOfMemoryException while allowing me to instantiate the cells of a sheet?

Question

  


Summary

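This question is a follow-up to my wish to architect a simple spreadsheet API while keeping it user-friendly for those who know Excel well.

To sum it up, this question is related to the two below:
1. How to implement column self-naming from its index? (http://stackoverflow.com/questions/5160001/how-to-implement-column-self-naming-from-its-index);
2. How to make this custom worksheet initialization faster? (http://stackoverflow.com/questions/5173833/how-to-make-this-custom-worksheet-initialization-faster).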

Objective

To provide a simplified Excel API that wraps the key components, such as the Application, Workbook, Worksheet and Range classes/interfaces, while exposing only the most commonly used properties of each of these objects.

Usage example

This usage example is inspired by the unit tests that allowed me to bring this solution up to where it stands now.

Dim file as String = "C:\Temp\WriteTest.xls"

Using mgr As ISpreadsheetManager = New SpreadsheetManager()
    Dim wb as IWorkbook = mgr.CreateWorkbook()
    wb.Sheets("Sheet1").Cells("A1").Value = 3.1415926
    wb.SaveAs(file)
End Using

And now we open it:

Dim file as String = "C:\Temp\WriteTest.xls"

Using mgr As ISpreadsheetManager = New SpreadsheetManager()
    Dim wb as IWorkbook = mgr.OpenWorkbook(file)
    ' Work with the workbook here...
End Using

Discussion

While instantiating an Excel Workbook:

  1. An instance of a Worksheet is automatically initialized in the Workbook.Sheets collection;
  2. Upon initialization, a Worksheet initializes its Cells through the Range object that can represent one or multiple cells.

These Cells are immediately accessible with all their properties as soon as the Worksheet exists.

My wish is to reproduce this behaviour so that:

  1. The Workbook class constructor initializes the Workbook.Sheets collection property with the native sheets;
  2. The Worksheet class constructor initializes the Worksheet.Cells collection property with the native cells.

My problem comes from the Worksheet class constructor while initializing the Worksheet.Cells collection property illustrated at #2.

Thoughts

Following the issues encountered in the two questions linked above, I would like to find another architecture that allows me to:

  1. Access the specific features of a cell Range when required;
  2. Deliver the most commonly used properties through my ICell interface;
  3. Have access to all of the Range cells of a worksheet as soon as it is initialized.

All this while keeping in mind that accessing a Range.Value property is the fastest interaction possible with the underlying Excel application instance through Interop.

So I thought of initializing my ReadOnlyDictionary(Of String, ICell) with the names of the cells without immediately wrapping an instance of the Range interface. I would simply generate the row and column indexes along with each cell's name to index my dictionary, and then assign the Cell.NativeCell property only when one wants to access or format a specific cell or cell range.

That way, the data in the dictionary would be indexed with the name of the cells obtained from the column indexes generated in the Worksheet class constructor. Then, when one would do this:

Using mgr As ISpreadsheetManager = New SpreadsheetManager()
    Dim wb As IWorkbook = mgr.CreateWorkbook()
    wb.Sheets(1).Cells("A1").Value = 3.1415926 ' #1
End Using

#1: This would allow me to use the indexes from my Cell class to write the given value to the specific cell, which is faster than using its name directly against the Range.
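The idea of generating column indexes and cell names to key the dictionary can be sketched without touching Excel at all. The snippet below is only an illustration under assumptions: the `ColumnNaming` class and its method names are hypothetical and are not part of the API described in the question.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original API.
public static class ColumnNaming
{
    // Converts a 1-based column index to its Excel-style name (1 -> "A", 27 -> "AA").
    public static string ToName(int index)
    {
        if (index < 1) throw new ArgumentOutOfRangeException("index");
        var name = new StringBuilder();
        while (index > 0)
        {
            int remainder = (index - 1) % 26;
            name.Insert(0, (char)('A' + remainder));
            index = (index - 1) / 26;
        }
        return name.ToString();
    }

    // Converts an Excel-style column name back to its 1-based index ("IV" -> 256).
    public static int ToIndex(string name)
    {
        if (string.IsNullOrEmpty(name)) throw new ArgumentNullException("name");
        int index = 0;
        foreach (char ch in name.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z') throw new FormatException(name);
            index = index * 26 + (ch - 'A' + 1);
        }
        return index;
    }

    // Builds a cell address such as "C7" from 1-based row and column indexes.
    public static string ToAddress(int row, int column)
    {
        return ToName(column) + row;
    }
}
```

With such a helper, a dictionary key like "A1" can be computed from (row 1, column 1) on demand, so no Range object has to be created just to name a cell.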

Questions and Concerns

Besides, UsedRange.get_Value() and Cells.get_Value() both return Object(,) arrays.

1. So should I just be happy working with Object(,) arrays for cells, without any possibility of formatting them?

2. How should I architect these Worksheet and Cell classes so that I get the best performance while working with Object(,) arrays, while keeping the possibility that a Cell instance may represent or wrap a single-cell Range?
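One possible way to keep address-based access while still working on a raw two-dimensional array is a thin wrapper that translates "B3" into array indexes. This is a minimal sketch, not the author's design: `ValueBlock` is a hypothetical name, and the code assumes addresses are relative to the top-left cell of the block. Arrays returned by Excel Interop for multi-cell ranges are 1-based, so the sketch offsets by the array's lower bounds rather than assuming 0.

```csharp
using System;

// Hypothetical wrapper, not part of the original API: reads values out of a
// 2-D block by A1-style address, relative to the block's top-left cell.
public sealed class ValueBlock
{
    private readonly object[,] _values;

    public ValueBlock(object[,] values)
    {
        if (values == null) throw new ArgumentNullException("values");
        _values = values;
    }

    // Returns the value at an address such as "B3".
    public object this[string address]
    {
        get
        {
            int row, column;
            Parse(address, out row, out column);
            // Offset by the lower bounds so both 0-based and 1-based arrays work.
            return _values[_values.GetLowerBound(0) + row - 1,
                           _values.GetLowerBound(1) + column - 1];
        }
    }

    // Splits "B3" into row 3 and column 2 (both 1-based).
    private static void Parse(string address, out int row, out int column)
    {
        if (string.IsNullOrEmpty(address)) throw new ArgumentNullException("address");
        int i = 0;
        column = 0;
        while (i < address.Length)
        {
            char c = char.ToUpperInvariant(address[i]);
            if (c < 'A' || c > 'Z') break;
            column = column * 26 + (c - 'A' + 1);
            i++;
        }
        if (column == 0 || !int.TryParse(address.Substring(i), out row) || row < 1)
            throw new FormatException(address);
    }
}
```

Values stay in the array, which keeps bulk reads fast. Formatting would still need a real Range, which is where a lazily created Cell wrapper could come in.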

Thanks to anyone who takes the time to read my post, and my sincerest thanks to those who answer.

Solution

The architecture I ended up with revolves around a class that I named CellCollection. Here's what it does.

It is based on these assumptions:

  1. Given that an Excel worksheet has 256 columns and 65,536 rows;

  2. Given that 16,777,216 (256 * 65,536) cells would need to be instantiated at once;

  3. Given that the most common use of a worksheet takes fewer than 1,000 rows and fewer than 100 columns;

  4. Given that I needed to be able to refer to the cells by their addresses ("A1"); and

  5. Given that, according to benchmarks, reading all the values at once and loading them into an object[,] in memory is the fastest way to work with an underlying Excel worksheet,

I decided not to instantiate any cells up front. The CellCollection property exposed by my IWorksheet interface is initialized and left empty upon instantiation, except when the workbook already exists. When opening a workbook, I check whether NativeSheet.UsedRange is empty or returns null (Nothing in Visual Basic). If it is not, the used "native cells" are already in memory, so all that remains is to add them to my internal CellCollection dictionary, indexed by their respective addresses.

Finally, the Lazy Initialization design pattern to the rescue! =)

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel; // ReadOnlyDictionary<TKey, TValue> (.NET 4.5+)
using System.Text.RegularExpressions;
using Excel = Microsoft.Office.Interop.Excel;

// ISheet is assumed to expose NativeSheet; ICell and Cell are defined elsewhere.
public class Sheet : ISheet {
    public Sheet(Excel.Worksheet nativeSheet) {
        NativeSheet = nativeSheet;
        Cells = new CellCollection(this);
    }

    public Excel.Worksheet NativeSheet { get; private set; }

    public CellCollection Cells { get; private set; }
}

public sealed class CellCollection {
    // One or more A1-style references separated by ':' or ',' (e.g. "A1", "A1:B2,C3").
    private static readonly Regex AddressPattern =
        new Regex(@"^[A-Za-z]{1,3}[0-9]*([:,][A-Za-z]{1,3}[0-9]*)*$");

    private readonly IDictionary<string, ICell> _cells;
    private readonly ReadOnlyDictionary<string, ICell> _readonlyCells;

    public CellCollection(ISheet sheet) {
        _cells = new Dictionary<string, ICell>();
        _readonlyCells = new ReadOnlyDictionary<string, ICell>(_cells);
        Sheet = sheet;
    }

    // Indexer: wraps the requested native cells on first access, then returns the read-only view.
    public ReadOnlyDictionary<string, ICell> this[string addresses] {
        get {
            if (addresses == null || addresses.Trim().Length == 0)
                throw new ArgumentNullException("addresses");

            if (!AddressPattern.IsMatch(addresses))
                throw new FormatException("addresses");

            foreach (string address in addresses.Split(',')) {
                Excel.Range range = Sheet.NativeSheet.get_Range(address, Type.Missing);

                foreach (Excel.Range cell in range) {
                    // Relative A1-style address ("A1", not "$A$1") used as the dictionary key.
                    string name = cell.get_Address(false, false, Excel.XlReferenceStyle.xlA1, Type.Missing, Type.Missing);
                    ICell c;
                    if (!_cells.TryGetValue(name, out c)) {
                        c = new Cell(cell);
                        _cells.Add(name, c);
                    }
                }
            }

            return _readonlyCells;
        }
    }

    public ISheet Sheet { get; private set; }
}
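To show the lazy-initialization idea in isolation, here is a minimal sketch that runs without Excel. Everything in it is an assumption made for illustration: `ICellSketch`, `CellSketch` and `LazyCellCollection` are hypothetical names, and the factory delegate stands in for the step that would wrap a native Range.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the answer's ICell; only a name and a value are modelled.
public interface ICellSketch
{
    string Name { get; }
    object Value { get; set; }
}

public sealed class CellSketch : ICellSketch
{
    public CellSketch(string name) { Name = name; }
    public string Name { get; private set; }
    public object Value { get; set; }
}

// Creates a cell only the first time its address is requested.
public sealed class LazyCellCollection
{
    private readonly Dictionary<string, ICellSketch> _cells =
        new Dictionary<string, ICellSketch>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, ICellSketch> _factory;

    public LazyCellCollection(Func<string, ICellSketch> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _factory = factory;
    }

    // Number of cells instantiated so far.
    public int Count { get { return _cells.Count; } }

    public ICellSketch this[string address]
    {
        get
        {
            if (string.IsNullOrEmpty(address)) throw new ArgumentNullException("address");
            ICellSketch cell;
            if (!_cells.TryGetValue(address, out cell))
            {
                // The factory runs once per address; later lookups hit the dictionary.
                cell = _factory(address);
                _cells.Add(address, cell);
            }
            return cell;
        }
    }
}
```

A collection built this way starts empty and grows only with the cells that are actually touched, which is what keeps the 16,777,216-cell worst case from ever being allocated.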

Obviously, this is a first attempt. It works just fine so far, with more than acceptable performance. I feel it could use some optimization, but I will use it this way for now and optimize it later if needed.

After writing this collection, I was able to get the expected behaviour. Next, I shall try to implement some of the .NET interfaces, such as IEnumerable, IEnumerable<T>, ICollection and ICollection<T>, so that it may be treated as a true .NET collection.

Feel free to comment and to suggest constructive alternatives and/or changes to this code so that it may become even better than it currently is.

I do hope this will serve someone's purpose someday.
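As a rough idea of what implementing IEnumerable<T> could look like, the sketch below exposes a dictionary-backed collection to foreach and LINQ. It is an assumption-laden illustration: `EnumerableCells` is a hypothetical class that stores plain values rather than ICell instances, so that it stays self-contained.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection, not the answer's CellCollection: stores values by address
// and can be enumerated like any other .NET collection.
public sealed class EnumerableCells : IEnumerable<KeyValuePair<string, object>>
{
    private readonly Dictionary<string, object> _cells = new Dictionary<string, object>();

    public int Count { get { return _cells.Count; } }

    // Returns null for an address that has never been written.
    public object this[string address]
    {
        get
        {
            object value;
            return _cells.TryGetValue(address, out value) ? value : null;
        }
        set { _cells[address] = value; }
    }

    // Generic enumerator: delegates to the inner dictionary.
    public IEnumerator<KeyValuePair<string, object>> GetEnumerator()
    {
        return _cells.GetEnumerator();
    }

    // Non-generic enumerator required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Only the cells that have been written are enumerated, which matches the lazy design: iterating never forces untouched cells into existence.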

Thanks for reading! =)
