Reading bytes from a JavaScript string

Question

I have a string containing binary data in JavaScript. Now I want to read, for example, an integer from it. So I get the first 4 characters, use charCodeAt, do some shifting, etc. to get an integer.

The problem is that strings in JavaScript are UTF-16 (rather than ASCII), so charCodeAt often returns values greater than 255.

The Mozilla reference states that "The first 128 Unicode code points are a direct match of the ASCII character encoding." (what about ASCII values > 128?).

How can I convert the result of charCodeAt to an ASCII value? Or is there a better way to convert a string of four characters to a 4 byte integer?

Recommended answer

I believe that you can do this with relatively simple bit operations:

function stringToBytes ( str ) {
  var ch, st, re = [];
  for (var i = 0; i < str.length; i++ ) {
    ch = str.charCodeAt(i);  // get char 
    st = [];                 // set up "stack"
    do {
      st.push( ch & 0xFF );  // push byte to stack
      ch = ch >> 8;          // shift value down by 1 byte
    }  
    while ( ch );
    // add stack contents to result
    // done because chars have "wrong" endianness
    re = re.concat( st.reverse() );
  }
  // return an array of bytes
  return re;
}

stringToBytes( "A\u1242B\u4123C" );  // [65, 18, 66, 66, 65, 35, 67]
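
Note that stringToBytes emits one byte for character codes below 256 and two bytes for larger ones, so array offsets do not line up with character positions. If the string is a "binary string" in which each character carries exactly one byte in its low 8 bits (an assumption about how the data was loaded, not something stated in the question), a fixed-width variant may be simpler. The sketch below uses a hypothetical helper name, binaryStringToBytes, and simply masks each character code:

```javascript
// Hypothetical helper (not part of the original answer):
// produces exactly one byte per character.
function binaryStringToBytes(str) {
  var bytes = new Uint8Array(str.length);
  for (var i = 0; i < str.length; i++) {
    // Keep only the low 8 bits; anything in the high byte is discarded.
    bytes[i] = str.charCodeAt(i) & 0xFF;
  }
  return bytes;
}

var bytes = binaryStringToBytes("\x00\x00\x01\x02");
console.log(Array.from(bytes)); // [0, 0, 1, 2]
```

This discards information whenever a character code is above 255, so it only fits data that really is one byte per character.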

Combining the output into larger numbers should then be a simple matter of reading the byte array as if it were memory:

function getIntAt ( arr, offs ) {
  return (arr[offs+0] << 24) +
         (arr[offs+1] << 16) +
         (arr[offs+2] << 8) +
          arr[offs+3];
}

function getWordAt ( arr, offs ) {
  return (arr[offs+0] << 8) +
          arr[offs+1];
}

'\\u' + getWordAt( stringToBytes( "A\u1242" ), 1 ).toString(16);  // "\\u1242"
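
One caveat with getIntAt: the << operator works on signed 32-bit integers, so a first byte of 128 or more yields a negative result. If an unsigned value is wanted, a sketch along the following lines might help. The names getUintAt and getUintAtDataView are hypothetical, and both assume big-endian byte order, as getIntAt does:

```javascript
// Hypothetical helper: like getIntAt, but the result is never negative.
function getUintAt(arr, offs) {
  return ((arr[offs] << 24) |
          (arr[offs + 1] << 16) |
          (arr[offs + 2] << 8) |
           arr[offs + 3]) >>> 0;  // >>> 0 reinterprets the value as unsigned
}

// The same read using the built-in DataView on a copy of the bytes.
function getUintAtDataView(arr, offs) {
  var view = new DataView(Uint8Array.from(arr).buffer);
  return view.getUint32(offs, false);  // false = big-endian
}

console.log(getUintAt([0xFF, 0x00, 0x00, 0x01], 0));          // 4278190081
console.log(getUintAtDataView([0xFF, 0x00, 0x00, 0x01], 0));  // 4278190081
```

The DataView version copies the array on every call, so for repeated reads it would be cheaper to build the Uint8Array once and reuse the view.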
