Convert Text to Unicode Escape Sequence


Question

I have a Text object containing some Latin characters that need to be converted to Unicode escape sequences of the format \u####, where each # is a hex digit.

As described here, Haskell easily converts strings to escape sequences and vice versa. However, it only produces the decimal representation. For example:

> let s = "Ñ"
> s
"\209"

Is there a way to specify the escape sequence encoding so that the output comes out in the format I need? That is:

> let s = encodeUnicode16 "Ñ"
> s
"\u00d1"

Recommended answer

How about this:

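import Text.Printf (printf)

-- Escape every character outside the range ' '..'z' as \uXXXX.
encodeUnicode16 :: String -> String
encodeUnicode16 = concatMap escapeChar
  where
    escapeChar c
        -- Characters from space up to 'z' are kept as they are.
        | ' ' <= c && c <= 'z' = [c]
        -- Everything else becomes \u plus at least four lowercase hex digits.
        | otherwise =
            printf "\\u%04x" (fromEnum c)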

In ghci, you can use it as follows:

> putStrLn $ encodeUnicode16 "Ñ"
\u00d1

Note that if you don't use putStrLn, the result will be escaped twice:

> encodeUnicode16 "Ñ"
"\\u00d1"

This is because ghci implicitly applies print to the expression, and print uses show, which adds the quotes and escapes the backslash.
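To look at the two layers of escaping separately, here is a minimal sketch. The names escaped and shown are only illustrative and are not part of the answer's code.

```haskell
-- The value encodeUnicode16 returns for "Ñ": six characters, \ u 0 0 d 1.
-- (The doubled backslash below is only Haskell source syntax.)
escaped :: String
escaped = "\\u00d1"

-- What print would display: show wraps the string in quotes and
-- escapes the single backslash it contains, giving nine characters.
shown :: String
shown = show escaped
```

With these loaded in ghci, putStrLn escaped prints the six raw characters, while putStrLn shown prints the same text that evaluating escaped on its own displays.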

Edit: I missed the part where you have a Text and not a String. Here's the same code for Text:

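import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Text.Printf (printf)

-- Same logic as the String version, built on Data.Text.
encodeUnicode16 :: Text -> Text
encodeUnicode16 = T.concatMap escapeChar
  where
    escapeChar c
        -- Characters from space up to 'z' are kept as they are.
        | ' ' <= c && c <= 'z' = T.singleton c
        -- printf produces a String here, which is then packed into Text.
        | otherwise =
            T.pack $ printf "\\u%04x" (fromEnum c)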

Again, you want to use T.putStrLn to avoid double escaping everything.
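One caveat, in case the input ever goes beyond Latin characters: for code points above U+FFFF, %04x produces five or more hex digits, which no longer matches \u####. The sketch below assumes the consumer expects UTF-16 style escapes (as JSON and Java do) and writes such characters as a surrogate pair. The name encodeUnicode16Pairs and the helper unit are hypothetical and not part of the answer above.

```haskell
import Data.Char (ord)
import Data.Text (Text)
import qualified Data.Text as T
import Text.Printf (printf)

-- Hypothetical variant of encodeUnicode16: code points above U+FFFF are
-- written as a UTF-16 surrogate pair so every escape has four hex digits.
encodeUnicode16Pairs :: Text -> Text
encodeUnicode16Pairs = T.concatMap escapeChar
  where
    escapeChar c
        | ' ' <= c && c <= 'z' = T.singleton c            -- kept as is
        | n <= 0xFFFF          = T.pack (unit n)          -- single escape
        | otherwise            = T.pack (unit hi ++ unit lo)
      where
        n  = ord c
        n' = n - 0x10000                 -- offset into the supplementary planes
        hi = 0xD800 + n' `div` 0x400     -- high (leading) surrogate
        lo = 0xDC00 + n' `mod` 0x400     -- low (trailing) surrogate

    -- Format one 16-bit unit as \uXXXX with lowercase hex digits.
    unit :: Int -> String
    unit = printf "\\u%04x"
```

For input that stays within the Basic Multilingual Plane, this behaves the same as the Text version above, so it is only worth the extra code if emoji or other supplementary characters can appear.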
