Haskell源编码 [英] Haskell source encoding

查看:95
本文介绍了Haskell源编码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Haskell 2010语言报告说:

The Haskell 2010 Language Report says:


Haskell使用Unicode [2]字符集。但是,源程序目前偏向早期版本的Haskell中使用的ASCII字符集。
Haskell uses the Unicode [2] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell.

这是否意味着UTF-8?

Does this mean UTF-8?

在ghc-7.0.4 /解析器/ Lexer.x.source:

In ghc-7.0.4/compiler/parser/Lexer.x.source:

$unispace    = \x05 -- Trick Alex into handling Unicode. See alexGetChar.
$whitechar   = [\ \n\r\f\v $unispace]
$white_no_nl = $whitechar # \n
$tab         = \t

$ascdigit  = 0-9
$unidigit  = \x03 -- Trick Alex into handling Unicode. See alexGetChar.
$decdigit  = $ascdigit -- for now, should really be $digit (ToDo)
$digit     = [$ascdigit $unidigit]

$special   = [\(\)\,\;\[\]\`\{\}]
$ascsymbol = [\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~]
$unisymbol = \x04 -- Trick Alex into handling Unicode. See alexGetChar.
$symbol    = [$ascsymbol $unisymbol] # [$special \_\:\"\']

$unilarge  = \x01 -- Trick Alex into handling Unicode. See alexGetChar.
$asclarge  = [A-Z]
$large     = [$asclarge $unilarge]

$unismall  = \x02 -- Trick Alex into handling Unicode. See alexGetChar.
$ascsmall  = [a-z]
$small     = [$ascsmall $unismall \_]

$unigraphic = \x06 -- Trick Alex into handling Unicode. See alexGetChar.
$graphic   = [$small $large $symbol $digit $special $unigraphic \:\"\']

...我不知道该怎么做。 alexGetChar 不是很有帮助。

...I'm not sure what to make of this. alexGetChar wasn't really helpful.

推荐答案

提案将UTF-8标准化为标准编码的Haskell源文件,但我不确定是否被接受。

There was a proposal to standardize on UTF-8 as the standard encoding of Haskell source files, but I'm not sure if it was accepted or not.

在实践中,GHC假定所有输入文件都是UTF-8,但它忽略了格式错误

In practice, GHC assumes all input files are UTF-8, but it ignores malformed byte sequences in comments.

这篇关于Haskell源编码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆