什么是最适合实现编辑器的数据结构,如记事本? [英] What is best data structure suitable to implement editor like notepad?

查看:511
本文介绍了什么是最适合实现编辑器的数据结构,如记事本?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在执行编辑器(如记事本)中使用哪些数据结构。这种数据结构应该是可扩展的,应该支持诸如版本,删除,滚动,文本范围的选择等各种功能?

Which data structure/s is used in implementation of editors like notepad. This data structure should be extensible, and should support various features like edition, deletion, scrolling, selection of range of text etc?

推荐答案

我们为旧机器编写了一个编辑器(请记住,这是一段时间,大约是1986年,所以这是从现在开始,从现在开始,技术水平可能会有所提升),我们设法通过使用自我管理池中的固定内存块来尖锐地表现出来,表现明智。

We wrote an editor for an old machine (keep in mind that this was a while ago, about 1986, so this is from memory, and the state of the art may have advanced somewhat since then) which we managed to get to scream along, performance wise, by using fixed memory blocks from self-managed pools.

它有两个池,每个池包含固定数量的特定大小块(一个池用于线路结构,另一个用于线段结构)。它基本上是链表的链接列表。

It had two pools, each containing a fixed number of specific-sized blocks (one pool was for line structures, the other for line-segment structures). It was basically a linked list of linked lists.

malloc()

Memory was pre-allocated (for each region) from a 'malloc()'-like call, and we used 65,535 blocks (0 through 65,534 inclusive, block number 65,535 was considered the null block, an end-of-list indicator).

这允许每个65,553行(填充版本为384K或512K)和大约1.6G的文件大小(占用2G的分配空间),这是漂亮的那么大。那就是理论上的文件大小限制 - 我不认为我们在现实中已经接近了,因为我们从未分配完整的行细分类结构。

This allowed each for 65, 535 lines (384K or 512K for the padded version) and about 1.6G of file size (taking 2G of allocated space), which was pretty big back then. That was the theoretical file size limit - I don't think we ever approached that in reality since we never allocated the full set of line segment structures.

不必为每个小内存块调用 malloc(),这给我们带来了巨大的增长速度,特别是我们可以优化我们自己的固定大小块的内存分配程序(包括在最终优化版本中内联呼叫)。

Not having to call malloc() for every little block of memory gave us a huge speed increase, especially as we could optimise our own memory allocation routines for fixed size blocks (including inlining the calls in the final optimised version).

两个池中的结构如下,每行为单个字节):

The structures in the two pools were as follows, with each line being a single byte):


Line structure (6/8 bytes)     Line-segment structure (32 bytes)
+--------+                     +--------+
|NNNNNNNN|                     |nnnnnnnn|
|NNNNNNNN|                     |nnnnnnnn|
|PPPPPPPP|                     |pppppppp|
|PPPPPPPP|                     |pppppppp|
|bbbbbbbb|                     |LLLLLLLL|
|bbbbbbbb|                     |LLLLLLLL|
|........|                     |xxxxxxxx|
|........|                     :25 more :
+--------+                     : x lines:
                               +--------+

其中:


  • code> x 指向线段池。

  • 大写字母指向行池。

  • N 是下一行的块号(null表示文件中的最后一行)。

  • P 上一行的块号(null表示文件中的第一行)。

  • b 是该行中第一行段的块号(null表示该行为空)。

  • 被保留填充(将结构颠倒到8个字节)。

  • n 是下一个的块号线段(null意味着这是行中的最后一个细分)

  • p 是上一个线段的块号null意思是这个wa是该行中的第一个细分。

  • L 是该段的行块的块号。

  • x 是该行中的26个字符。

  • Lower-case letters other than x point to the line segment pool.
  • Upper-case letters point to the line pool.
  • N was a block number for the next line (null meaning this was the last line in the file).
  • P the the block number for the previous line (null meaning this was the first line in the file).
  • b was the block number for the first line segment in that line (null meaning the line was empty).
  • . was reserved padding (to bump the structure out to 8 bytes).
  • n was the block number for the next line segment (null meaning this was the last segment in the line).
  • p was the block number for the previous line segment (null meaning this was the first segment in the line).
  • L was the block number for the segment's line block.
  • x was the 26 characters in that line segment.

线结构被填充的原因是将块数转换为实际存储器位置(向左移动3位,比在特定架构中乘以6快得多,使用的额外内存只有128K,与总共存储使用)虽然我们确实为那些关心内存的人提供了更慢的版本。

The reason the line structure was padded was to speed up the conversion of block numbers into actual memory locations (shifting left by 3 bits was much faster than multiplying by 6 in that particular architecture and extra memory used was only 128K, minimal compared to the total storage used) although we did provide the slower version for those who cared more about memory.

我们还有一个包含行段的100位16位数组(和行号,所以我们可以快速去特定的行)大致百分比(所以数组[7]是文件大约7%的行)和两个空闲指针来维护每个池中的空闲列表(这个是一个非常简单的单向列表,其中 N n 在结构中下一个空闲块,空闲块从这些列表的前面分配并放回到这些列表的前面)。

We also had an array of 100 16-bit values which contained the line segment (and line number so we could quickly go to specific lines) at roughly that percentage (so that array[7] was the line that was roughly 7% into the file) and two free pointers to maintain the free list in each pool (this was a very simple one way list where N or n in the structure indicated the next free block and free blocks were allocated from, and put back to, the front of these lists).

没有必要保留字符数在每个线段中,因为0字节在文件中无效。每个线段被允许在最后被完全忽略的0字节。每当它们被修改时,线被压缩(即,组合线段)。这使得块的使用率不高(垃圾收集不经常且冗长),也大大加快了搜索和替换操作。

There was no need to keep a count of the characters in each line segment since 0-bytes were not valid in files. Each line segment was allowed to have 0-bytes at the end that were totally ignored. Lines were compressed (i.e., line segments were combined) whenever they were modified. This kept block usage low (without infrequent and lengthy garbage collection) and also greatly sped up search-and-replace operations.

使用这些结构允许 快速编辑,插入,删除,搜索和浏览文字,这是您可能会在简单的文本编辑器中获得大部分性能问题的地方。

The use of these structures allowed very fast editing, insertion, deletion, searching and navigation around the text, which is where you're likely to get most of your performance problems in a simple text editor.

使用选择(我们没有实现这一点,因为它是一个文本模式编辑器,使用类似vi的命令,如 3d 删除3行或<$可以通过使用 {line#/ block,char-pos} 元组来标记位置来实现c $ c> 6x 删除6个字符)在文本中,并使用其中的两个元组作为选择范围。

The use of selections (we didn't implement this as it was a text mode editor that used vi-like commands such as 3d to delete 3 lines or 6x to delete 6 characters) could be implemented by having a {line#/block, char-pos} tuple to mark positions in the text, and use two of those tuples for a selection range.

这篇关于什么是最适合实现编辑器的数据结构,如记事本?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆