How to parse/encode binary message formats?

Question

I need to parse and encode a legacy binary message format in Java. I began by using DataInputStream and DataOutputStream to read and write primitive types, but the problem I'm having is that the message format doesn't align nicely to byte offsets and includes bit flags.

For example I have to deal with messages like this:

+--------+---+---+--------+------------+----------------+
| uint32 | b | b | uint32 | 4-bit enum | 32-byte string |
+--------+---+---+--------+------------+----------------+

Where (b) is a one-bit flag. The problem is that the fields don't align to byte boundaries, so I wouldn't be able to use DataOutputStream to encode this, since the smallest unit I can write is a byte.

Are there any libraries, standard or 3rd party, for dealing with arbitrary bit level message formats?

Edit:
Thanks to @Software Monkey for forcing me to look at my spec more closely. The spec I am using does actually align on byte boundaries so DataOutputStream is appropriate. Given my original question though I would have gone with the solution proposed by @emboss.

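Edit:
Although the message format for this question turned out to be aligned on byte boundaries, I have since come across another message format to which the original question does apply. This format defines a 6-bit character mapping in which each character really only occupies 6 bits rather than a full byte, so the character strings do not align on byte boundaries. I have found several binary output streams that solve this problem, such as this one: http://introcs.cs.princeton.edu/java/stdlib/BinaryOut.java.html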

Recommended answer

There is a builtin byte type in Java, and you can read into byte[] buffers just fine using InputStream#read(byte[]) and write to an OutputStream using OutputStream#write(byte[], int, int), so there's no problem in that.

Regarding your messages - as you noted correctly, the tiniest bit of information you get at a time is a byte, so you will have to decompose your message format into 8 bit chunks first:

Let's suppose your message is in a byte[] named data. I also assume big-endian (network) byte order, i.e. the most significant byte comes first.

A uint32 is 32 bits long, that is, four bytes. (Be careful when parsing this in Java: Java ints and longs are signed, so you need to handle that. An easy way to avoid trouble is to hold the result in a long.) data[0] fills bits 31 - 24, data[1] bits 23 - 16, data[2] bits 15 - 8 and data[3] bits 7 - 0. So you need to shift them appropriately to the left and glue them together with bitwise OR:

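// Mask with a long literal (0xFFL) so the shifts happen in 64-bit arithmetic;
// with an int mask, a set top bit in data[0] would sign-extend into the long.
long uint32 = ((data[0] & 0xFFL) << 24) |
              ((data[1] & 0xFFL) << 16) |
              ((data[2] & 0xFFL) << 8)  |
               (data[3] & 0xFFL);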
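
For the encode direction that the question also asks about, the shifts simply run the other way. The following is a minimal sketch, assuming the most-significant-byte-first layout described above; the class and method names (UInt32Codec, readUInt32, writeUInt32) are illustrative and not part of any library.

```java
// Hypothetical helper, not a library class: big-endian uint32 <-> long.
final class UInt32Codec {

    /** Reads an unsigned 32-bit value from data[offset..offset+3], most significant byte first. */
    static long readUInt32(byte[] data, int offset) {
        return ((data[offset]     & 0xFFL) << 24)
             | ((data[offset + 1] & 0xFFL) << 16)
             | ((data[offset + 2] & 0xFFL) << 8)
             |  (data[offset + 3] & 0xFFL);
    }

    /** Writes value (0 .. 2^32 - 1) into data[offset..offset+3], most significant byte first. */
    static void writeUInt32(byte[] data, int offset, long value) {
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("not a uint32: " + value);
        }
        data[offset]     = (byte) (value >>> 24); // the cast keeps only the low 8 bits
        data[offset + 1] = (byte) (value >>> 16);
        data[offset + 2] = (byte) (value >>> 8);
        data[offset + 3] = (byte) value;
    }
}
```

Holding the value in a long keeps the full unsigned range representable; an int would show values above 2^31 - 1 as negative numbers.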

Next, there are two single bits. I suppose you have to check whether they are "on" (1) or "off" (0). To do this, you use bit masks and test your byte against them with bitwise AND.

First bit: ( binary mask | 1 0 0 0 0 0 0 0 | = 128 = 0x80 )

if ( (data[4] & 0x80 ) == 0x80 ) // on

Second bit: ( binary mask | 0 1 0 0 0 0 0 0 | = 64 = 0x40 )

if ( (data[4] & 0x40 ) == 0x40 ) // on
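
When encoding, the same masks are used to switch a flag on or off. This is a small sketch, assuming the two flags occupy the top two bits of a byte as in the masks above; BitFlags and its members are hypothetical names chosen for illustration.

```java
// Hypothetical helper, not a library class: test and set single-bit flags in a byte.
final class BitFlags {
    static final int FIRST_FLAG  = 0x80; // binary 1000 0000
    static final int SECOND_FLAG = 0x40; // binary 0100 0000

    /** Returns true if the bit selected by mask is 1 in b. */
    static boolean isSet(byte b, int mask) {
        return (b & mask) != 0;
    }

    /** Returns b with the bit selected by mask forced to 1 (on) or 0 (off). */
    static byte set(byte b, int mask, boolean on) {
        return (byte) (on ? (b | mask) : (b & ~mask));
    }
}
```

For example, data[4] = BitFlags.set(data[4], BitFlags.FIRST_FLAG, true) turns the first flag on without touching the other six bits of that byte.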

To compose the next uint32, you will have to assemble its bytes across the byte boundaries of the underlying data. E.g. for the first byte, take the remaining 6 bits of data[4] and shift them two to the left (they become bits 7 to 2 of that byte), then "add" the first (highest) two bits of data[5] by shifting them 6 bits to the right (they take the remaining slots, bits 1 and 0). "Adding" means combining with bitwise OR:

// The cast to byte drops the two flag bits that were shifted past bit 7.
byte uint32Byte1 = (byte) (((data[4] & 0xFF) << 2) | ((data[5] & 0xFF) >> 6));

Building your uint32 is then the same procedure as in the first example. And so on and so forth.
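
Rather than deriving the shifts by hand for every field, the same idea can be wrapped in a small bit-level reader and writer. The following is only a sketch under stated assumptions: bits are consumed most significant bit first, fields are at most 32 bits wide, and the final partial byte is padded with zero bits. BitReader and BitWriter are hypothetical classes written for this illustration, not standard library types.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical reader: returns fields of 0..32 bits, most significant bit first. */
final class BitReader {
    private final byte[] data;
    private int bitPos; // absolute bit index; 0 is the top bit of data[0]

    BitReader(byte[] data) { this.data = data; }

    long readBits(int n) {
        if (n < 0 || n > 32) throw new IllegalArgumentException("n must be 0..32");
        if ((long) bitPos + n > (long) data.length * 8) {
            throw new IllegalStateException("not enough data");
        }
        long result = 0;
        for (int i = 0; i < n; i++) {
            int current = data[bitPos >> 3] & 0xFF;          // byte holding this bit
            int bit = (current >> (7 - (bitPos & 7))) & 1;   // pick the bit, MSB first
            result = (result << 1) | bit;
            bitPos++;
        }
        return result;
    }
}

/** Hypothetical writer: appends fields of 0..32 bits, most significant bit first. */
final class BitWriter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current; // bits collected for the byte in progress
    private int filled;  // number of bits held in 'current' (0..7)

    void writeBits(long value, int n) {
        if (n < 0 || n > 32) throw new IllegalArgumentException("n must be 0..32");
        for (int i = n - 1; i >= 0; i--) {
            current = (current << 1) | (int) ((value >>> i) & 1);
            if (++filled == 8) {        // a full byte is ready
                out.write(current);
                current = 0;
                filled = 0;
            }
        }
    }

    /** Returns all bytes written; a trailing partial byte is padded with zero bits. */
    byte[] toByteArray() {
        byte[] done = out.toByteArray();
        if (filled == 0) return done;
        byte[] padded = Arrays.copyOf(done, done.length + 1);
        padded[done.length] = (byte) (current << (8 - filled));
        return padded;
    }
}
```

With these, the first fields of the example message would be read as readBits(32), readBits(1), readBits(1), readBits(32) and readBits(4), and a 6-bit character code could be written with writeBits(code, 6). Processing one bit per loop iteration favors clarity over speed.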
