Why is Haskell's default string implementation a linked list of chars?


Question

It is well known that Haskell's default String implementation is inefficient in both speed and memory. As far as I know, [] lists in general are implemented in Haskell as singly-linked lists, and for most small/simple data types (e.g. Int) that doesn't seem like a very good idea, but for String it seems like total overkill. Some of the opinions on this matter include:

Real World Haskell

On simple benchmarks like this, even programs written in interpreted languages such as Python can outperform Haskell code that uses String by an order of magnitude.

Efficient String Implementation in Haskell

Since a String is just [Char], that is a linked list of Char, it means Strings have poor locality of reference, and again means that Strings are fairly large in memory, at a minimum it's N * (21bits + Mbits) where N is the length of the string and M is the size of a pointer (...). Strings are much less likely to be able to be optimized to loops, etc. by the compiler.
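To make the quoted lower bound concrete, here is a minimal arithmetic sketch. The names minStringBits and packedAsciiBits are hypothetical, the 64-bit pointer size and the 1000-character length are illustrative assumptions, and the one-byte-per-character comparison assumes ASCII-only text with no buffer header. It only evaluates the formula from the quote; it does not measure what any particular compiler actually allocates.

```haskell
module Main where

-- Lower bound from the quoted formula: N * (21 bits + M bits),
-- where N is the string length and M is the pointer size in bits.
minStringBits :: Integer -> Integer -> Integer
minStringBits n m = n * (21 + m)

-- Hypothetical comparison point: a packed buffer holding one byte per
-- character (assumes ASCII-only text and ignores any header overhead).
packedAsciiBits :: Integer -> Integer
packedAsciiBits n = n * 8

main :: IO ()
main = do
  let n = 1000 -- illustrative string length
      m = 64   -- assumed pointer size on a 64-bit machine
  print (minStringBits n m)   -- 85000
  print (packedAsciiBits n)   -- 8000
```

Under these assumptions the linked-list bound is roughly ten times the size of the packed buffer, which is the gap the quote is pointing at.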

I know that Haskell has ByteStrings (and Arrays) in several nice flavors and that they can do the job nicely, but I would expect the default implementation to be the most efficient one.

TL;DR: Why is Haskell's default String implementation a singly-linked list even though it is terribly inefficient and rarely used for real world applications (except for the really simple ones)? Are there historical reasons? Is it easier to implement?

Recommended answer

Why is Haskell's default String implementation a singly-linked list

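Because singly-linked lists:

  • support induction via pattern matching
  • have useful properties, such as Functor and Monad instances
  • are properly parametrically polymorphic
  • are naturally lazy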

As a result, String as [Char] (a list of Unicode code points) is a string type that fits the language goals (as of 1990) and essentially comes "for free" with the list library.
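The list properties named above can be sketched in a few lines. This is an illustrative example rather than anything from the original answer: the helper names countSpaces, shout, doubleChars, and firstFive are hypothetical, and only functions from base are used.

```haskell
module Main where

import Data.Char (toUpper)

-- Induction via pattern matching: structural recursion over [Char].
countSpaces :: String -> Int
countSpaces []       = 0
countSpaces (c : cs) = (if c == ' ' then 1 else 0) + countSpaces cs

-- Functor: fmap applies a function to every character.
shout :: String -> String
shout = fmap toUpper

-- Monad: list bind replaces each character with a list of characters.
doubleChars :: String -> String
doubleChars s = s >>= \c -> [c, c]

-- Laziness: only the first five characters of the infinite list are built.
firstFive :: String
firstFive = take 5 (cycle "ab")

main :: IO ()
main = do
  print (countSpaces "a b c")   -- 2
  putStrLn (shout "hello")      -- HELLO
  putStrLn (doubleChars "abc")  -- aabbcc
  -- Parametric polymorphism: reverse :: [a] -> [a] works on String unchanged.
  putStrLn (reverse "abc")      -- cba
  putStrLn firstFive            -- ababa
```

None of these functions needed a string-specific library; they come from the general list machinery, which is the sense in which String comes "for free".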

In summary, the language designers were historically more interested in well-designed core data types than in the modern problems of text processing, so we have an elegant, easy to understand, easy to teach String type that isn't quite a Unicode text chunk and isn't a dense, packed, strict data type.
