How to calculate the byte length of a string containing UTF-8 characters using JavaScript?


Question

I have a textbox in which the user can enter ASCII characters, other UTF-8 characters, or a combination of both. Is there any API in JavaScript that can calculate the length in bytes of the string entered in the textbox?

For example, if I enter ASCII characters, say mystring, the length would be calculated as 8. But when UTF-8 characters are entered, each character can take 2, 3, or 4 bytes.

Let's say the characters entered are i ♥ u; the length in bytes is 5.

The textbox accepts a maximum length of 31 characters. But if UTF-8 characters are entered, it will not accept the string i ♥ u i ♥ u i ♥ u i ♥ u i ♥ u, whose length is 30.

Can we restrict the user to entering no more than 31 characters, even for UTF-8 characters?

Recommended answer

Counting UTF-8 bytes comes up quite a bit in JavaScript. A bit of looking around turns up a number of libraries that can help (here's one example: https://github.com/mathiasbynens/utf8.js). I also found a thread (https://gist.github.com/mathiasbynens/1010324) full of solutions specifically for UTF-8 byte counts.

Here is the smallest and most accurate function from that thread:

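function countUtf8Bytes(s) {
    var b = 0, i = 0, c;
    // Walk the UTF-16 code units; charCodeAt returns NaN past the end, which stops the loop
    // (a NUL character, code 0, stops it early as well).
    // Each unit adds 1 byte below 0x80, 2 bytes below 0x800, and 3 bytes otherwise,
    // so a surrogate pair is counted as 6 bytes rather than the 4 that UTF-8 uses.
    for (; c = s.charCodeAt(i++); b += c >> 11 ? 3 : c >> 7 ? 2 : 1);
    return b;
}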

Note: I rearranged it a bit so that the signature is easier to read. However, it's still a very compact function that may be hard for some to understand.
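As a rough check of the function above, the sketch below runs it on the strings from the question and compares it with the built-in TextEncoder. It assumes TextEncoder is available (modern browsers and Node.js), and utf8ByteLength is a hypothetical helper name, not something from the thread.

```javascript
// The compact counter from the answer, reproduced so this snippet runs on its own.
function countUtf8Bytes(s) {
    var b = 0, i = 0, c;
    for (; c = s.charCodeAt(i++); b += c >> 11 ? 3 : c >> 7 ? 2 : 1);
    return b;
}

// Hypothetical helper. Assumption: TextEncoder exists in the environment.
// TextEncoder always encodes to UTF-8, so the array length is the byte count.
function utf8ByteLength(s) {
    return new TextEncoder().encode(s).length;
}

console.log(countUtf8Bytes("mystring"));      // 8
console.log(countUtf8Bytes("i\u2665u"));      // 5 (no spaces)
console.log(countUtf8Bytes("i \u2665 u"));    // 7 (with spaces)
console.log(utf8ByteLength("i \u2665 u"));    // 7

// The two disagree for characters outside the Basic Multilingual Plane,
// which JavaScript stores as a surrogate pair (two UTF-16 code units).
const emoji = "\u{1F600}";
console.log(countUtf8Bytes(emoji));           // 6
console.log(utf8ByteLength(emoji));           // 4
```

If the input may contain emoji or other supplementary characters, the TextEncoder variant is probably the safer choice; for text limited to the Basic Multilingual Plane both give the same count.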

You can check its results with this tool: https://mothereff.in/byte-counter

One correction to your OP: the example string you provided, i ♥ u, is actually 7 bytes (1 byte each for i, u, and the two spaces, plus 3 bytes for ♥), and this function counts it correctly.
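For the other half of the question, limiting input to 31 bytes, one possible approach is to trim the string to the longest prefix that fits. The sketch below is only an illustration: utf8BytesForCodePoint and truncateToUtf8Bytes are hypothetical names, and wiring the result back into the textbox (for example from an input event handler) is left out.

```javascript
// Hypothetical helper: number of UTF-8 bytes needed for one Unicode code point.
function utf8BytesForCodePoint(cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Hypothetical helper: longest prefix of s that fits in maxBytes UTF-8 bytes,
// never cutting through the middle of a character.
function truncateToUtf8Bytes(s, maxBytes) {
    let bytes = 0;
    let out = "";
    for (const ch of s) { // for...of iterates by code point, not by UTF-16 code unit
        const size = utf8BytesForCodePoint(ch.codePointAt(0));
        if (bytes + size > maxBytes) break;
        bytes += size;
        out += ch;
    }
    return out;
}

// "i ♥ u " is 6 characters but 8 bytes, so five repeats are 30 characters and 40 bytes.
const input = "i \u2665 u ".repeat(5);
const limited = truncateToUtf8Bytes(input, 31);
console.log(limited.length); // 23 characters, which take exactly 31 bytes
```

Counting by code point rather than by UTF-16 code unit means a 4-byte character is either kept whole or dropped, so the result is never left with half of a surrogate pair.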
