Parsing a byte array containing fields of unknown length


Question

I am parsing a byte array in Java that has the following specification:

Trace data format:
    - 4 bytes containing the Id.
    - 4 bytes containing the address.
    - N bytes containing the first name, where 0 < N < 32
    - N bytes containing the last name, where 0 < N < 32
    - 4 bytes containing the Minimum
    - 4 bytes containing the Maximum 
    - 4 bytes containing the Resource Default Level

So far I don't see any way to parse this array and get 7 variables with the correct types. Can you confirm that, or am I missing something like a magic function in Java that finds String "limits" in a byte array? (I can't see how the bytes of the Minimum value can be distinguished from ASCII characters belonging to the names.)

Is there any "convention" about a special character between the 2 strings?

Recommended answer

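Well, you know that the first name starts at the ninth byte (zero-based offset 8, after the two 4-byte fields), and that the last name ends at zero-based offset (length - 13), just before the 12 bytes of trailing integers. What is uncertain is how to find where the first name ends and the last name begins. I see a few possible solutions: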


  • If the format was defined by a C programmer, the two name fields are most likely each terminated by a null byte, since that's the C convention for strings.
  • If it was defined by a Java programmer, it could have been written by writeUTF(), which prefixes each string with a 2-byte length; that would mean the byte counts in the specification are most likely wrong. However, this at least pins down the encoding, which is otherwise an open question.
  • If it was defined by a COBOL programmer, the two fields could be fixed-length and padded with zeroes or spaces, with the format specification listing the payload length rather than the field length.
  • If it was defined by a really incompetent programmer (in whatever language), it contains the two names with neither a delimiter nor a length count, so it's not possible to separate them reliably (if the information isn't there, no "magic" function in Java or elsewhere can conjure it out of thin air). I suppose you could hope that the last name always starts with an uppercase letter and that nobody uses double names or all-caps.
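
To make the first three guesses concrete, here is a minimal Java sketch. It is not the actual format's parser: the class TraceRecordSketch, the Trace holder and all method names are made up for illustration, and it assumes big-endian integers and ASCII names, neither of which the specification states. It reads the fixed-width fields from both ends of the record, then tries one splitting strategy per guess on the bytes in between.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TraceRecordSketch {
    /** Hypothetical holder for the fixed-width fields; names are illustrative. */
    static final class Trace {
        int id, address, minimum, maximum, defaultLevel;
    }

    /** Fills the five ints and returns the unsplit name bytes between them. */
    static byte[] readFixedFields(byte[] record, Trace out) {
        // 8 leading bytes, at least 1 byte per name, 12 trailing bytes
        if (record.length < 8 + 2 + 12) {
            throw new IllegalArgumentException("record too short: " + record.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(record); // assumption: big-endian ints
        int tail = record.length - 12;            // start of the three trailing ints
        out.id = buf.getInt(0);
        out.address = buf.getInt(4);
        out.minimum = buf.getInt(tail);
        out.maximum = buf.getInt(tail + 4);
        out.defaultLevel = buf.getInt(tail + 8);
        return Arrays.copyOfRange(record, 8, tail);
    }

    /** C-style guess: each name is followed by a 0 byte. */
    static String[] splitNullTerminated(byte[] names) {
        int firstEnd = indexOfZero(names, 0);
        if (firstEnd < 0) {
            throw new IllegalArgumentException("no terminator after first name");
        }
        int lastStart = firstEnd + 1;
        int lastEnd = indexOfZero(names, lastStart);
        if (lastEnd < 0) lastEnd = names.length; // tolerate a missing final terminator
        return new String[] {
            new String(names, 0, firstEnd, StandardCharsets.US_ASCII),
            new String(names, lastStart, lastEnd - lastStart, StandardCharsets.US_ASCII)
        };
    }

    /** writeUTF guess: each name is a 2-byte length followed by modified UTF-8. */
    static String[] splitWriteUtf(byte[] names) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(names))) {
            return new String[] { in.readUTF(), in.readUTF() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fixed-width guess: two equally wide fields padded with zeroes or spaces. */
    static String[] splitFixedWidth(byte[] names) {
        if (names.length % 2 != 0) {
            throw new IllegalArgumentException("odd length, fields cannot be equally wide");
        }
        int width = names.length / 2;
        return new String[] {
            trimPadding(names, 0, width),
            trimPadding(names, width, names.length)
        };
    }

    private static String trimPadding(byte[] data, int from, int to) {
        int end = to;
        while (end > from && (data[end - 1] == 0 || data[end - 1] == ' ')) end--;
        return new String(data, from, end - from, StandardCharsets.US_ASCII);
    }

    private static int indexOfZero(byte[] data, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == 0) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] record = {0, 0, 0, 1, 0, 0, 0, 2, 'A', 'n', 'n', 0, 'L', 'e', 'e', 0,
                         0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0, 5};
        Trace t = new Trace();
        String[] names = splitNullTerminated(readFixedFields(record, t));
        System.out.println(t.id + " " + names[0] + " " + names[1] + " " + t.defaultLevel);
    }
}
```

Which splitter applies, if any, can only be decided by looking at real sample data or asking whoever defined the format. For the fourth case there is nothing to implement: without a delimiter or a count, any split is a guess.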
