In Haskell, how do I get the number of bytes in a UTF-8 string?

Question

Say I have the UTF-8 string "Hello Snowman ☃!". It has 16 characters and takes up 18 bytes. How can I get Haskell to show me the number of bytes this string takes up?

I've tried using Data.ByteArray, Data.Text, and ByteString, and in each case I have come up short.

Recommended answer

You can use the excellent utf8-string package for this.

import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as UTF8

-- Encode the String as UTF-8, then count the bytes of the result.
numBytesUtf8 :: String -> Int
numBytesUtf8 = BS.length . UTF8.fromString

Then, to use your example:

ghci> numBytesUtf8 "Hello Snowman ☃!"
18
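
Since the question mentions Data.Text: here is a minimal sketch of the same idea using the text package instead of utf8-string, assuming your string is (or can be packed into) a Text value. The name numBytesText is made up for illustration; encodeUtf8 comes from Data.Text.Encoding.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Number of bytes in the UTF-8 encoding of a Text value.
-- (Hypothetical helper name, not part of any library.)
numBytesText :: T.Text -> Int
numBytesText = BS.length . TE.encodeUtf8

main :: IO ()
main = do
  -- "\x2603" is the snowman character, written as an escape so the
  -- source file's own encoding does not matter.
  let s = T.pack "Hello Snowman \x2603!"
  print (T.length s)      -- characters: 16
  print (numBytesText s)  -- UTF-8 bytes: 18
```

The two numbers differ because the snowman (U+2603) needs three bytes in UTF-8 while every other character in the example needs one.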

Of course, you should probably not be doing this in the first place. UTF8.fromString and BS.length are probably the functions you want, but if you care about how many bytes a string takes to encode, that string probably ought to be a ByteString already.
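
To illustrate that last point, here is a rough sketch of working with data that is already a ByteString. It assumes the bytes are valid UTF-8; the value sample is only a stand-in for bytes that would normally come from a file or a socket. BS.length then gives the byte count directly, and UTF8.length from utf8-string counts the decoded characters.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as UTF8

-- Stand-in for bytes that arrived already UTF-8 encoded
-- (hypothetical example value; "\x2603" is the snowman).
sample :: BS.ByteString
sample = UTF8.fromString "Hello Snowman \x2603!"

main :: IO ()
main = do
  print (BS.length sample)    -- bytes: 18
  print (UTF8.length sample)  -- characters: 16
```

Keeping the data as a ByteString means the byte count is always available without re-encoding, and you only decode when you actually need characters.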
