The right data structure to use for an Excel clone
Question
Let's say I'm working on an Excel clone in C#. My grid is represented as follows:
private struct CellValue
{
    private int column;
    private int row;
    private string text;
}

private List<CellValue> cellValues = new List<CellValue>();
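Each time the user adds text, I package it as a CellValue and add it to cellValues. Given a CellValue, I can determine its row and column in O(1) time, which is great. However, given a column and a row, I need to loop through the entire cellValues list to find the text in that cell, which is terribly slow. Likewise, given a text, I need to loop through the whole list. Is there a data structure with which I can achieve all three tasks in O(1) time?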
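Update:
Looking through some of the answers, I don't think I have found one that I like. Can I:
- Avoid keeping more than two copies of a CellValue, so that I don't have to keep them in sync? In the C world I would make good use of pointers.
- Have rows and columns that can be added dynamically (unlike Excel)?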
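Answer
I would opt for a sparse array (a linked list of linked lists) to give maximum flexibility with minimum storage.
In this example, you have a linked list of rows, with each element pointing to a linked list of the cells in that row (you could reverse the cells and rows depending on your needs).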
 |
 V
+-+    +---+             +---+
|1| -> |1.1| ----------> |1.3| -:
+-+    +---+             +---+
 |
 V
+-+             +---+
|7| ----------> |7.2| -:
+-+             +---+
 |
 =
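Each row element holds its row number, and each cell element has a pointer to its row element, so getting the row number from a cell is O(1).
Similarly, each cell element holds its column number, making that O(1) as well.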
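The structure described above can be sketched in C# roughly as follows. This is a minimal illustration under stated assumptions, not the answerer's own code: the names `SparseGrid`, `RowNode` and `CellNode` are hypothetical, and both lists are assumed to be kept sorted so that a search can stop early.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names: SparseGrid, RowNode and CellNode are illustrative only.
public sealed class RowNode
{
    public int Row { get; }
    // Cells of this row, kept sorted by column number.
    public LinkedList<CellNode> Cells { get; } = new LinkedList<CellNode>();
    public RowNode(int row) { Row = row; }
}

public sealed class CellNode
{
    public RowNode Owner { get; }    // back-pointer to the row element
    public int Column { get; }       // stored directly in the cell
    public string Text { get; set; }
    public int Row => Owner.Row;     // O(1): one hop through the back-pointer

    public CellNode(RowNode owner, int column, string text)
    {
        Owner = owner;
        Column = column;
        Text = text;
    }
}

public sealed class SparseGrid
{
    // Only non-empty rows are stored, kept sorted by row number.
    private readonly LinkedList<RowNode> rows = new LinkedList<RowNode>();

    // Inserts a cell or overwrites its text; walks both lists linearly.
    public CellNode Set(int row, int column, string text)
    {
        RowNode rowNode = FindOrAddRow(row);
        LinkedListNode<CellNode> node = rowNode.Cells.First;
        while (node != null && node.Value.Column < column) node = node.Next;
        if (node != null && node.Value.Column == column)
        {
            node.Value.Text = text;
            return node.Value;
        }
        var cell = new CellNode(rowNode, column, text);
        if (node == null) rowNode.Cells.AddLast(cell);
        else rowNode.Cells.AddBefore(node, cell);
        return cell;
    }

    // Not O(1): scans the row list, then the cell list of the matching row.
    public CellNode Find(int row, int column)
    {
        foreach (RowNode r in rows)
        {
            if (r.Row > row) break;      // sorted, so the row is absent
            if (r.Row < row) continue;
            foreach (CellNode c in r.Cells)
            {
                if (c.Column == column) return c;
                if (c.Column > column) break;
            }
            return null;
        }
        return null;
    }

    private RowNode FindOrAddRow(int row)
    {
        LinkedListNode<RowNode> node = rows.First;
        while (node != null && node.Value.Row < row) node = node.Next;
        if (node != null && node.Value.Row == row) return node.Value;
        var created = new RowNode(row);
        if (node == null) rows.AddLast(created);
        else rows.AddBefore(node, created);
        return created;
    }
}
```

With this sketch, `grid.Set(1, 3, "x")` returns a `CellNode` whose `Row` and `Column` are read without any searching, while `grid.Find(1, 3)` has to walk the lists. Storage grows only with the number of non-empty cells and rows.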
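There is no easy way to get O(1) for finding the cell at a given row/column, but a sparse array is as fast as it's going to get unless you pre-allocate information for every possible cell so that you can do index lookups on an array, which would be very wasteful in terms of storage.
One thing you could do is make one dimension non-sparse, for example by making the columns the primary array (rather than a linked list) and limiting them to 1,000. This would make the column lookup indexed (fast), followed by a search on the sparse rows.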
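A rough sketch of that hybrid layout is shown below. The class name `ColumnIndexedGrid` is hypothetical, and the sparse rows are held in a `SortedDictionary<int, string>` rather than the linked list the answer describes. That is a substitution made here for brevity, and it makes the row search logarithmic rather than linear.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: columns are a fixed array, rows within a column are sparse.
public sealed class ColumnIndexedGrid
{
    public const int MaxColumns = 1000; // the limit suggested in the answer

    // One slot per column; a slot stays null until the column gets its first cell.
    // Assumption: SortedDictionary stands in for the answer's linked list of rows.
    private readonly SortedDictionary<int, string>[] columns =
        new SortedDictionary<int, string>[MaxColumns];

    public void Set(int row, int column, string text)
    {
        if (column < 0 || column >= MaxColumns)
            throw new ArgumentOutOfRangeException(nameof(column));
        if (columns[column] == null)
            columns[column] = new SortedDictionary<int, string>();
        columns[column][row] = text; // insert or overwrite
    }

    // Column lookup is a plain array index; the row lookup searches the sparse rows.
    public bool TryGet(int row, int column, out string text)
    {
        text = null;
        if (column < 0 || column >= MaxColumns) return false;
        SortedDictionary<int, string> rows = columns[column];
        return rows != null && rows.TryGetValue(row, out text);
    }
}
```

The cost of this design is the fixed array of 1,000 column slots, paid even for an empty sheet, plus a hard upper bound on the number of columns.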
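I don't think you can ever get O(1) for a text lookup, simply because text can be duplicated in multiple cells (unlike row/column). I still believe the sparse array will be the fastest way to search for text, unless you maintain a sorted index of all text values in another array (again, that can make it faster, but at the expense of copious amounts of memory).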
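One way such a secondary text index might look is sketched below. `TextIndex` is a hypothetical name, and a `SortedDictionary` is used here in place of the sorted array the answer mentions. Because the same text can appear in several cells, each key maps to a list of positions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a sorted index from cell text to every cell holding it.
public sealed class TextIndex
{
    private readonly SortedDictionary<string, List<(int Row, int Column)>> byText =
        new SortedDictionary<string, List<(int Row, int Column)>>(StringComparer.Ordinal);

    // Call whenever a cell receives this text.
    public void Add(string text, int row, int column)
    {
        if (!byText.TryGetValue(text, out var cells))
        {
            cells = new List<(int Row, int Column)>();
            byText[text] = cells;
        }
        cells.Add((row, column));
    }

    // Call whenever a cell is cleared or overwritten, to keep the index in sync.
    public bool Remove(string text, int row, int column)
    {
        if (!byText.TryGetValue(text, out var cells)) return false;
        bool removed = cells.Remove((row, column));
        if (cells.Count == 0) byText.Remove(text);
        return removed;
    }

    // Logarithmic in the number of distinct texts; returns every matching cell.
    public IReadOnlyList<(int Row, int Column)> Find(string text)
    {
        return byText.TryGetValue(text, out var cells)
            ? cells
            : (IReadOnlyList<(int Row, int Column)>)Array.Empty<(int Row, int Column)>();
    }
}
```

The index has to be updated on every edit to the grid, and it stores each text a second time as a key, which is the memory cost the answer warns about.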