Read file with UTF-8 in Haskell as IO String

Question

I have the following code, which works fine unless the file contains UTF-8 characters:

module Main where
import Ref
main = do
    text <- getLine
    theInput <- readFile text
    writeFile ("a"++text) (unlist . proc . lines $ theInput)

With UTF-8 characters I get this error: hGetContents: invalid argument (invalid byte sequence)

Since the file I'm working with has UTF-8 characters, I would like to handle this exception so that, if possible, I can keep reusing the functions imported from Ref.

Is there a way to read a UTF-8 file as IO String so that I can reuse the functions from Ref? What modifications should I make to my code? Thanks in advance.

Here are the function declarations from my Ref module:

unlist :: [String] -> String
proc :: [String] -> [String]

From the Prelude:

lines :: String -> [String]

Recommended answer

Thanks for the answers, but I found the solution myself. The file I was working with actually had this encoding:

ISO-8859 text, with CR line terminators

So for my Haskell code to work with that file, it should have this encoding instead:

UTF-8 Unicode text, with CR line terminators
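
If converting the file is not convenient, one alternative is to tell GHC which encoding to decode with. This is a minimal sketch and not the code from the answer above: `readFileWith` is a hypothetical helper, built only from `System.IO` functions, and it assumes the "ISO-8859" reported by `file` means ISO-8859-1 (`latin1`).

```haskell
module Main where

import System.IO

-- | Hypothetical helper: read a whole file as a String, decoding it with
-- the given encoding instead of the locale default that readFile uses.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = do
    h <- openFile path ReadMode
    hSetEncoding h enc
    contents <- hGetContents h
    -- Force the lazy string so the file is fully read before the handle closes.
    length contents `seq` hClose h
    return contents

main :: IO ()
main = do
    name <- getLine
    -- latin1 is ISO-8859-1 (an assumption about the input file);
    -- pass utf8 instead for a file that really is UTF-8.
    theInput <- readFileWith latin1 name
    hSetEncoding stdout utf8
    putStr theInput
```

The result is an ordinary `String`, so it can be passed to `lines` and to the functions from Ref exactly as in the question.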

You can check the file's encoding with the file utility, like this:

$ file filename
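
A similar check can be sketched in Haskell itself. It is not equivalent to `file`, which guesses among many encodings. The hypothetical helper `isValidUtf8` below only reports whether the whole file decodes as UTF-8, and it relies on GHC raising an `IOException` (the "invalid byte sequence" error from the question) when decoding fails.

```haskell
module Main where

import Control.Exception (IOException, evaluate, try)
import System.IO

-- | Hypothetical helper: True if the whole file decodes as UTF-8,
-- False if GHC reports a decoding error while reading it.
isValidUtf8 :: FilePath -> IO Bool
isValidUtf8 path =
    withFile path ReadMode $ \h -> do
        hSetEncoding h utf8
        contents <- hGetContents h
        -- Force every character so that a decoding error surfaces here.
        result <- try (evaluate (length contents)) :: IO (Either IOException Int)
        return (either (const False) (const True) result)

main :: IO ()
main = do
    name <- getLine
    ok <- isValidUtf8 name
    putStrLn (if ok then "decodes as UTF-8" else "not valid UTF-8")
```

A file that contains only ASCII bytes also passes this check, since ASCII is a subset of UTF-8.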

To change the file's encoding, follow the instructions from this link.
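
The conversion can also be sketched in Haskell. This is an illustration under stated assumptions, not the method from the linked instructions: `convertEncoding` is a hypothetical helper, the file names are placeholders, and the source is assumed to be ISO-8859-1.

```haskell
module Main where

import System.IO

-- | Hypothetical helper: re-encode a file by decoding it with `from`
-- and writing it back out with `to`. The source and destination paths
-- must differ, because the input is read lazily while the output is written.
convertEncoding :: TextEncoding -> TextEncoding -> FilePath -> FilePath -> IO ()
convertEncoding from to src dst =
    withFile src ReadMode $ \hin ->
    withFile dst WriteMode $ \hout -> do
        hSetEncoding hin from
        hSetEncoding hout to
        -- Disable newline translation so line terminators pass through unchanged.
        hSetNewlineMode hin noNewlineTranslation
        hSetNewlineMode hout noNewlineTranslation
        hGetContents hin >>= hPutStr hout

main :: IO ()
main =
    -- Placeholder file names; latin1 is an assumption about the source file.
    convertEncoding latin1 utf8 "input.txt" "output.txt"
```

After the conversion, the `readFile` call from the question should be able to decode the output file on a system whose locale encoding is UTF-8.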
