When would you use unpack('h*' ...) or pack('h*' ...)?


Question

In Perl, pack and unpack have two templates for converting bytes to/from hex:

h    A hex string (low nybble first).
H    A hex string (high nybble first).

This is best clarified with an example:

use 5.010; # so I can use say
my $buf = "\x12\x34\x56\x78";

say unpack('H*', $buf); # prints 12345678
say unpack('h*', $buf); # prints 21436587

As you can see, H is what people generally mean when they think about converting bytes to/from hexadecimal. So what's the purpose of h? Larry must have thought someone might use it, or he wouldn't have bothered to include it.
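The relationship between the two templates can be checked directly: every byte contributes two hex digits, and h emits them in the opposite order from H. A minimal sketch (the variable names are only illustrative):

```perl
use strict;
use warnings;

my $buf = "\x12\x34\x56\x78";

my $high_first = unpack('H*', $buf);   # high nybble of each byte first
my $low_first  = unpack('h*', $buf);   # low nybble of each byte first

# Swapping the two digits of every byte turns one form into the other.
(my $swapped = $high_first) =~ s/(.)(.)/$2$1/g;

print "H*: $high_first\n";
print "h*: $low_first\n";
print "identical after swapping digit pairs\n" if $swapped eq $low_first;

# pack with the same template reverses the conversion.
my $round_trip = pack('h*', $low_first);
print "round trip ok\n" if $round_trip eq $buf;
```

So the two templates carry exactly the same information; the choice only affects the order in which the digits of each byte are written.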

Can you give a real-world example where you'd actually want to use h instead of H with pack or unpack? I'm looking for a specific example; if you know of a machine that organized its bytes like that, what was it, and can you link to some documentation on it?

I can think of examples where you could use h, such as serializing some data when you don't really care what the format is, as long as you can read it back, but H would be just as useful for that. I'm looking for an example where h is more useful than H.

Answer

Recall that in the bad ol' days of MS-DOS, certain OS functions were controlled by setting the high nibble and low nibble of a register and performing an Interrupt xx. For example, Int 21 accessed many file functions. You would set the high nibble as the drive number -- who will have more than 15 drives? The low nibble was the requested function on that drive, etc.

Here is some old CPAN code that uses pack as you describe to set the registers to perform an MS-DOS system call.

Blech!!! I don't miss MS-DOS at all...

--Edit

Here is specific source code: download Perl 5.00402 for DOS and unzip it. In the files Opcode.pm and Opcode.pl you can see unpack("h*",$_[0]); used here:

sub opset_to_hex ($) {
    return "(invalid opset)" unless verify_opset($_[0]);
    unpack("h*",$_[0]);
}

I did not follow the code all the way through, but my suspicion is this is to recover info from an MS-DOS system call...
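One property of h that may matter for a function like opset_to_hex is sketched below. This is an assumption about the motivation, not something the Opcode source states: if the argument is a bit vector of the kind vec builds, h* produces hex digits in ascending bit order, so digit k always covers bits 4k to 4k+3. The sample vector and the helper bit_is_set_in_hex are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an opset: a bit vector with bits 0, 5 and 8 set.
my $bits = '';
vec($bits, $_, 1) = 1 for (0, 5, 8);

my $low_first  = unpack('h*', $bits);
my $high_first = unpack('H*', $bits);

print "h*: $low_first\n";
print "H*: $high_first\n";

# With h*, hex digit number k (counting from 0) covers bits 4k .. 4k+3,
# so a bit can be looked up in the string without swapping digits.
sub bit_is_set_in_hex {
    my ($hex, $bit) = @_;
    my $digit = hex(substr($hex, int($bit / 4), 1));
    return ($digit >> ($bit % 4)) & 1;
}

print "bit $_ set: ", bit_is_set_in_hex($low_first, $_), "\n" for (0, 4, 5, 8);
```

With H* the same lookup would need the digits of each byte swapped first, which is one plausible reason to prefer h when dumping bit vectors.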

In perlport for Perl 5.8.8, you have these suggested tests for the endianness of the target:

Different CPUs store integers and floating point numbers in different orders (called endianness) and widths (32-bit and 64-bit being the most common today). This affects your programs when they attempt to transfer numbers in binary format from one CPU architecture to another, usually either "live" via network connection, or by storing the numbers to secondary storage such as a disk file or tape.

Conflicting storage orders make utter mess out of the numbers. If a little-endian host (Intel, VAX) stores 0x12345678 (305419896 in decimal), a big-endian host (Motorola, Sparc, PA) reads it as 0x78563412 (2018915346 in decimal). Alpha and MIPS can be either: Digital/Compaq used/uses them in little-endian mode; SGI/Cray uses them in big-endian mode. To avoid this problem in network (socket) connections use the pack and unpack formats n and N, the "network" orders. These are guaranteed to be portable.

As of perl 5.8.5, you can also use the > and < modifiers to force big- or little-endian byte-order. This is useful if you want to store signed integers or 64-bit integers, for example.
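The portable formats mentioned in the quoted passage can be compared side by side. The following is a small sketch that assumes a perl recent enough to support the < and > modifiers; the value and variable names are illustrative only.

```perl
use strict;
use warnings;

my $n = 0x12345678;

# Fixed byte orders, independent of the host CPU.
my $network = unpack('H*', pack('N',  $n));   # unsigned 32-bit, big-endian
my $vax     = unpack('H*', pack('V',  $n));   # unsigned 32-bit, little-endian
my $big     = unpack('H*', pack('l>', $n));   # signed 32-bit, forced big-endian
my $little  = unpack('H*', pack('l<', $n));   # signed 32-bit, forced little-endian

print "N : $network\n";
print "V : $vax\n";
print "l>: $big\n";
print "l<: $little\n";

# Signed values survive a round trip when the byte order is explicit.
my ($back) = unpack('l>', pack('l>', -2));
print "round trip of -2: $back\n";
```

Because the byte order is spelled out in the template, these lines should print the same digits on any host, unlike the native s and l formats.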

You can explore the endianness of your platform by unpacking a data structure packed in native format such as:

   print unpack("h*", pack("s2", 1, 2)), "\n";
   # '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
   # '00100020' on e.g. Motorola 68040

If you need to distinguish between endian architectures you could use either of the variables set like so:

   $is_big_endian    = unpack("h*", pack("s", 1)) =~ /01/;
   $is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;
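The two tests above can be wrapped into a small runnable script and cross-checked against the byte order perl recorded when it was built. This is only a sketch: the use of $Config{byteorder} as a cross-check is an addition for illustration and is not part of the quoted perlport text.

```perl
use strict;
use warnings;
use Config;

# Native 16-bit short holding 1, rendered low nybble first.
my $probe = unpack('h*', pack('s', 1));

my $is_big_endian    = $probe =~ /01/ ? 1 : 0;   # '0010' on big-endian hosts
my $is_little_endian = $probe =~ /^1/ ? 1 : 0;   # '1000' on little-endian hosts

print "probe: $probe\n";
print "big-endian: $is_big_endian, little-endian: $is_little_endian\n";

# Byte order recorded at build time, e.g. '1234' or '12345678' for
# little-endian and '4321' or '87654321' for big-endian.
my $config_little = $Config{byteorder} =~ /^1/ ? 1 : 0;
print "Config byteorder: $Config{byteorder}\n";
```

On common hardware exactly one of the two flags is set, and it should agree with what Config reports.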

Differing widths can cause truncation even between platforms of equal endianness. The platform of shorter width loses the upper parts of the number. There is no good solution for this problem except to avoid transferring or storing raw binary numbers.

One can circumnavigate both these problems in two ways. Either transfer and store numbers always in text format, instead of raw binary, or else consider using modules like Data::Dumper (included in the standard distribution as of Perl 5.005) and Storable (included as of perl 5.8). Keeping all data as text significantly simplifies matters.

The v-strings are portable only up to v2147483647 (0x7FFFFFFF), that's how far EBCDIC, or more precisely UTF-EBCDIC will go.

It seems that unpack("h*",...) is used more often than pack("h*",...). I did note that return qq'unpack("F", pack("h*", "$hex"))'; is used in Deparse.pm, and that IO-Compress uses pack("*h",...) in Perl 5.12.

If you want further examples, here is a Google Code Search list. You can see that pack|unpack("h*"...) is fairly rare and mostly has to do with determining platform endianness...
