GHC truncating Unicode character output

Problem description

I can't get GHCi or GHC to print Unicode code point U+221A (the square root sign: √).

I don't think it's my shell, because I can get Ruby to do it:

irb> puts "\u221A"
√

GHC/GHCi is another issue:

ghci> putStrLn "\8730"

ghci> withFile "temp.out" WriteMode $ flip hPutStrLn "\8730"
ghci> readFile "temp.out"
"\SUB\n"

So what am I doing wrong?

(GHC v6.10.3)

Solution

GHC's Unicode behavior changed in GHC 6.12.1 to "do the right thing" with Unicode strings. Prior versions truncated each character to 8 bits on I/O, which forced the use of an encoding library.

That is, '\8730' is 0x221a, while '\SUB' is 0x1a -- the high byte is gone.
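The truncation can be reproduced in pure code. This is only a sketch of the effect described above, not GHC's actual I/O code; the name truncateTo8Bits is made up for illustration:

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)
import Numeric (showHex)

-- Hypothetical helper: keep only the low 8 bits of a code point,
-- mimicking the truncation described for GHC versions before 6.12.
truncateTo8Bits :: Char -> Char
truncateTo8Bits c = chr (ord c .&. 0xFF)

main :: IO ()
main = do
  let c = '\8730'
  putStrLn (showHex (ord c) "")                    -- 221a
  putStrLn (showHex (ord (truncateTo8Bits c)) "")  -- 1a
  print (truncateTo8Bits c)                        -- '\SUB'
```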

Here with GHC 7:

Prelude> print "√\n"
"\8730\n"
Prelude> putStr "√\n"
√
Prelude> putStr "\8730√\n"
√√

But I get your result with GHC 6.8, like this:

Prelude> writeFile "/tmp/x" "√\n"
Prelude> readFile "/tmp/x"
"\SUB\n"

Each character's code point is being truncated to 8 bits.

GHC 7 + IO works as expected:

Prelude> writeFile "/tmp/x" "\8730√\n"
Prelude> readFile "/tmp/x"
"\8730\8730\n"
Prelude> s <- readFile "/tmp/x"
Prelude> putStr s
√√
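With GHC 6.12 or later, handles use the locale's encoding by default, so the session above assumes a UTF-8 locale. If you would rather not depend on the locale, you can set the encoding on each handle with hSetEncoding from System.IO. A minimal sketch (writeUtf8 and readUtf8 are names I made up, and temp.out is just the file name from the question):

```haskell
import System.IO

-- Hypothetical helper: write a String to a file as UTF-8,
-- regardless of the current locale.
writeUtf8 :: FilePath -> String -> IO ()
writeUtf8 path str = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h str

-- Hypothetical helper: read a whole file back, decoding it as UTF-8.
readUtf8 :: FilePath -> IO String
readUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  contents <- hGetContents h
  -- Force the lazy read to finish before withFile closes the handle.
  length contents `seq` return contents

main :: IO ()
main = do
  writeUtf8 "temp.out" "\8730\n"
  s <- readUtf8 "temp.out"
  print s                    -- "\8730\n"
  hSetEncoding stdout utf8   -- emit UTF-8 bytes on stdout as well
  putStr s
```

Whether the last line shows up as √ still depends on the terminal interpreting the bytes as UTF-8.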

Can you upgrade to GHC 7 (available in the Haskell Platform) to get full Unicode support? If this is not possible, you can use one of the encoding libraries, such as utf8-string.
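If you stay on the older compiler, one way to use utf8-string is to encode to bytes yourself and do the I/O through ByteString, which never touches the truncating String path. A rough sketch, assuming the utf8-string and bytestring packages are installed; writeFileUtf8 and readFileUtf8 are names I made up, while encode and decode come from Codec.Binary.UTF8.String:

```haskell
import qualified Data.ByteString as B
import Codec.Binary.UTF8.String (encode, decode)

-- Hypothetical helper: encode the String to UTF-8 bytes and write them raw.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path = B.writeFile path . B.pack . encode

-- Hypothetical helper: read raw bytes and decode them from UTF-8.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = fmap (decode . B.unpack) (B.readFile path)

main :: IO ()
main = do
  writeFileUtf8 "temp.out" "\8730\n"
  s <- readFileUtf8 "temp.out"
  print s                        -- "\8730\n"
  -- Print by writing the encoded bytes directly to stdout.
  B.putStr (B.pack (encode s))
```

The square root sign is written to the file as the three bytes E2 88 9A, so nothing is lost on the way out or back in.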
