为什么FingerTrees没有足够的使用来稳定实现? [英] Why aren't FingerTrees used enough to have a stable implementation?

查看:166
本文介绍了为什么FingerTrees没有足够的使用来稳定实现?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

前一段时间,我遇到了关于FingerTrees的文章(见此外,随附的堆栈溢出问题),并提交想法离开我终于找到了使用它们的理由。

A while ago, I ran across an article on FingerTrees (See Also an accompanying Stack Overflow Question) and filed the idea away. I have finally found a reason to make use of them.

我的问题是, Data.FingerTree包在边缘似乎有点腐烂。此外, Data.Sequence 使用数据结构重新实现(可能更好)的版本,但不导出它。

My problem is that the Data.FingerTree package seems to have a little bit rot around the edges. Moreover, Data.Sequence in the Containers package which makes use of the data structure re-implements a (possibly better) version, but doesn't export it.

从理论上讲,这个结构似乎是有用的,似乎没有很多的实际使用或注意力。有人发现FingerTrees作为一个实际的事情没有用,还是没有足够的关注?

As theoretically useful as this structure seems to be, it doesn't seem to get a lot of actual use or attention. Have people found that FingerTrees are not useful as a practical matter, or is this a case not enough attention?

进一步解释

我有兴趣构建一个具有良好连接属性的文本的数据结构。考虑从各种碎片构建HTML文档。大多数预制解决方案都使用测试,但我真的想要一些处理Unicode文本的东西。我目前的计划是将Data.Text片段分解成FingerTree。

I'm interested in building a data structure holding text that has good concatenation properties. Think about building an HTML document from assorted fragments. Most pre-built solutions use bytestrings, but I really want something that deals with Unicode text properly. My plan at the moment is to layer Data.Text fragments into a FingerTree.

我还要借用Data.Vector的技巧,无需复制就可以使用偏移,长度)操纵。 Data.Text.Text具有内置于数据类型的内容,但仅使用它来实现高效的无效和无效的操作。在FingerTree中,这个信息很容易成为树的 v 或注释。

I would also like to borrow the trick from Data.Vector of taking slices without copying using (offset,length) manipulation. Data.Text.Text has this built in to the data type, but only uses it for efficient uncons and unsnoc opperations. In FingerTree this information could very easily becomes the v or annotation of the tree.

推荐答案

为了回答您关于手指树的问题,我认为问题是与阵列相比具有相对较高的恒定成本,并且比实现高效连接的其他方法更为复杂。一个Builder有一个更加高效的接口,只需要添加块,而且它们通常是随时可用的(请参阅@ informatikr的答案中的链接)。假设 Data.Text.Lazy 是用链接的块列表实现的,你正在创建一个 Data.Text.Lazy 从构建器。除非你有很多块(可能超过50个),或者重复访问列表结尾附​​近的数据,否则手指树的高恒定成本可能不值得。

To answer your question about finger trees in particular, I think the problem is that they have relatively high constant costs compared to arrays, and are more complex than other ways of achieving efficient concatenation. A Builder has a more efficient interface for just appending chunks, and they're usually readily available (see the links in @informatikr's answer). Suppose that Data.Text.Lazy is implemented with a linked list of chunks, and you're creating a Data.Text.Lazy from a builder. Unless you have a lot of chunks (probably more than 50), or are accessing data near the end of the list repeatedly, the high constant cost of a finger tree probably isn't worth it.

由于性能原因, Data.Sequence 实现是专门的,并不像通常由 fingerertree提供的完整界面包。这就是为什么它不出口;

The Data.Sequence implementation is specialized for performance reasons, and isn't as general as the full interface provided by the fingertree package. That's why it isn't exported; it's not really possible to use it for anything other than a Sequence.

我也怀疑很多程序员都在关于如何实际使用monoidal注释的损失,因为它背后是相当重要的抽象屏障。很多人不会使用它,因为他们看不到与其他数据类型相比可以有用的。

I also suspect that many programmers are at a loss as to how to actually use the monoidal annotation, as it's behind a fairly significant abstraction barrier. So many people wouldn't use it because they don't see how it can be useful compared to other data types.

我没有真正得到它,直到我读Chung-chieh Shan的字数的博客系列(part2 part3 part4 )。这就是证明这个想法肯定可以用在实际代码中。

I didn't really get it until I read Chung-chieh Shan's blog series on word numbers (part2, part3, part4). That's proof that the idea can definitely be used in practical code.

在你的情况下,如果你需要检查部分结果并有效追加,使用指尖可能是比建设者好。根据构建器的实现,您可能会在转换为 Text 时,重复执行大量工作,向构建器添加更多内容,转换为文本等等,这取决于你的使用模式。

In your case, if you need to both inspect partial results and have efficient appends, using a fingertree may be better than a builder. Depending on the builder's implementation, you may end up doing a lot of repeated work as you convert to Text, add more stuff to the builder, convert to Text again, etc. It would depend on your usage pattern though.

您可能对我的 splaytree 软件包感兴趣,该软件包提供喷洒具有单一注释的树,并建立了几个不同的结构。除了splay树本身, Set RangeSet 模块具有更多或更少的完整API,序列模块主要是我用于测试的骨架。它不是一个电池包含的解决方案,你正在寻找(再次,@ informatikr的答案提供那些),但是如果你想试验monoidal注释,它可能比 Data.FingerTree 。请注意,如果您按顺序遍历所有元素(或连续snoc到最后或类似的),则播放树可能会失去平衡,但是如果追加和查找是交错的,则性能可以很好。

You might be interested in my splaytree package, which provides splay trees with monoidal annotations, and several different structures build upon them. Other than the splay tree itself, the Set and RangeSet modules have more-or-less complete API's, the Sequence module is mostly a skeleton I used for testing. It's not a "batteries included" solution to what you're looking for (again, @informatikr's answer provides those), but if you want to experiment with monoidal annotations it may be more useful than Data.FingerTree. Be aware that a splay tree can get unbalanced if you traverse all the elements in sequence (or continually snoc onto the end, or similar), but if appends and lookups are interleaved performance can be excellent.

这篇关于为什么FingerTrees没有足够的使用来稳定实现?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆