System.Directory.getDirectoryContents unicode support
The following code prints something like °Ð½Ð´Ð¸Ñ-ÐÑпаниÑ:
getDirectoryContents "path/to/directory/that/contains/files/with/nonASCII/names"
  >>= mapM_ putStrLn
This looks like a GHC bug that has already been fixed in the repository. But what should one do until everybody upgrades GHC?
The last time I ran into this problem (a few years ago, by the way), I used the utf8-string package to convert strings, but I don't remember how I did it, and GHC's Unicode support has changed noticeably over the last few years.
So, what is the best (or at least a working) way to get directory contents with full Unicode support?
GHC version: 7.0.4
Locale: en_US.UTF-8
Here's a simple workaround using decodeString and encodeString from utf8-string:
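import System.Directory
import qualified Codec.Binary.UTF8.String as UTF8

main = do
  -- On the affected GHC, names come back as raw UTF-8 bytes (one Char per byte), so decode before printing.
  getDirectoryContents "." >>= mapM_ (putStrLn . UTF8.decodeString)
  putStrLn "------------"
  -- Going the other way, encode the name before handing it to file I/O.
  readFile (UTF8.encodeString "brøken-file-nåme.txt") >>= putStrLn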
Output:
.
..
brøken-file-nåme.txt
Broken.hs
------------
hello
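One caveat with the workaround above: once GHC is upgraded to a version that decodes file names itself, calling decodeString unconditionally would decode them a second time. A minimal sketch of a guarded variant follows. It assumes the utf8-string package is installed, and decodeIfNeeded is a hypothetical helper name, not something the package provides. It relies on isUTF8Encoded from the same module, which is only a heuristic: a genuine name made up of Latin-1 characters that happen to form a valid UTF-8 byte sequence would still be decoded.

```haskell
import qualified Codec.Binary.UTF8.String as UTF8
import System.Directory (getDirectoryContents)

-- | Hypothetical helper (not part of utf8-string): decode a name only when
-- it still looks like raw UTF-8 bytes stored one per Char.
decodeIfNeeded :: FilePath -> FilePath
decodeIfNeeded name
  | UTF8.isUTF8Encoded name = UTF8.decodeString name
  | otherwise               = name

main :: IO ()
main = do
  let proper = "brøken-file-nåme.txt"
      -- Simulates what the affected GHC returns: one Char per UTF-8 byte.
      raw    = UTF8.encodeString proper
  print (length proper, length raw)        -- (20,22): ø and å take two bytes each
  print (decodeIfNeeded raw == proper)     -- True: byte-wise name is decoded
  print (decodeIfNeeded proper == proper)  -- True: already-decoded name is left alone
  -- Same listing as in the workaround, but safe to run on either kind of GHC.
  getDirectoryContents "." >>= mapM_ (putStrLn . decodeIfNeeded)
```

For pure ASCII names such as Broken.hs the helper is a no-op either way, since decoding ASCII leaves it unchanged.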