Text packing algorithm

Views: 13  Published: 2021/12/22 20:08:40  algorithm text packing

Problem description

I bet somebody has solved this before, but my searches have come up empty.

I want to pack a list of words into a buffer, keeping track of the starting position and length of each word. The trick is that I'd like to pack the buffer efficiently by eliminating the redundancy.

Example: doll dollhouse house

These can be packed into the buffer simply as dollhouse, remembering that doll is four letters starting at position 0, dollhouse is nine letters at 0, and house is five letters at 4.
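
As a minimal sketch of that layout (Python, 0-based offsets; the names `buffer`, `table`, and `lookup` are illustrative and not from the question), each word is stored as an (offset, length) pair into the shared buffer. With 0-based indexing, house begins at index 4 of dollhouse:

```python
# Shared buffer holding all the words.
buffer = "dollhouse"

# Hypothetical lookup table: word -> (offset, length) into the buffer.
table = {
    "doll": (0, 4),
    "dollhouse": (0, 9),
    "house": (4, 5),
}

def lookup(word):
    """Recover a word from the buffer using its (offset, length) entry."""
    offset, length = table[word]
    return buffer[offset:offset + length]
```

Calling `lookup(w)` for each key should give back `w` itself, which is a quick way to sanity-check a packed table.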

What I've come up with so far is:

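1. Sort the words from longest to shortest: (dollhouse, house, doll)
2. Scan the buffer to see if the string already exists as a substring; if so, note the location.
3. If it doesn't already exist, add it to the end of the buffer.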
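
The steps above can be sketched roughly as follows. This is only one reading of the description; the function name `pack_longest_first` and the returned (offset, length) table are assumptions for illustration:

```python
def pack_longest_first(words):
    """Pack words into one buffer, reusing existing substrings.

    Returns (buffer, table) where table maps word -> (offset, length).
    """
    buffer = ""
    table = {}
    # dict.fromkeys removes duplicates while keeping the input order.
    unique = list(dict.fromkeys(words))
    # Longest words first, so shorter words can be found inside them.
    for word in sorted(unique, key=len, reverse=True):
        pos = buffer.find(word)
        if pos == -1:
            # Not present yet: append to the end of the buffer.
            pos = len(buffer)
            buffer += word
        table[word] = (pos, len(word))
    return buffer, table
```

For `["doll", "dollhouse", "house"]` this produces the buffer `dollhouse`. Each word costs one substring search over the buffer, so the total cost depends on the number of words times the buffer length, which seems acceptable for a preprocessing step.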

Since long words often contain shorter words, this works pretty well, but it should be possible to do significantly better. For example, if I extend the word list to include ragdoll, then my algorithm comes up with dollhouseragdoll (16 characters), which is less efficient than ragdollhouse (12 characters).
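
One possible direction, which is not part of the question itself, is to treat this as a shortest-common-superstring problem and use the usual greedy heuristic: drop words that are contained in other words, then repeatedly merge the two pieces whose suffix/prefix overlap is largest. The sketch below is a naive illustration under that assumption; `overlap` and `pack_greedy_overlap` are hypothetical names, the result is not guaranteed to be optimal, and this version is roughly cubic in the number of words, so a list of tens of thousands of words would need a smarter way to find overlaps.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def pack_greedy_overlap(words):
    """Greedy superstring heuristic; returns (buffer, table)."""
    unique = list(dict.fromkeys(words))
    # Words contained in another word need no space of their own.
    pieces = [w for w in unique
              if not any(w != other and w in other for other in unique)]
    while len(pieces) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the pair, writing the shared characters only once.
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for idx, p in enumerate(pieces)
                  if idx not in (best_i, best_j)]
        pieces.append(merged)
    buffer = pieces[0] if pieces else ""
    # Every input word is now a substring of the buffer.
    table = {w: (buffer.find(w), len(w)) for w in unique}
    return buffer, table
```

On `["doll", "dollhouse", "house", "ragdoll"]` this yields the buffer `ragdollhouse`, since ragdoll and dollhouse overlap on the four letters of doll.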

This is a preprocessing step, so I'm not terribly worried about speed. O(n^2) is fine. On the other hand, my actual list has tens of thousands of words, so O(n!) is probably out of the question.

As a side note, this storage scheme is used for the data in the `name' table of a TrueType font, cf. http://www.microsoft.com/typography/otspec/name.htm
