Python byte representation of a hex string that is EBCDIC

Question

I have a string in hex:

Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'

As a byte object it looks like this:

b'\xe3\x88\x85@'b'\xe3\x88\x85@\x83\x96\x94\x97\xa4'b'\xe3\x88\x85@'b'\xe3\x88\x85@\x83\x96\x94\x97\xa4'b'\xe3\x88\x85@\x83'b'\xe3\x88'b'\xe3\x88\x85@\x83\x96\x94\x97\xa4'

In EBCDIC it is this:

The computer has rebooted from a bugcheck.

So I know that hex 40 (\x40) is a space in EBCDIC and an '@' in ASCII.

I can't figure out why Python, when printing the bytes objects, prints '@' instead of '\x40'.

My sample test code is:

import codecs
Hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'

output = []
DDF = [4,9,4,9,5,2,9]
distance = 0

# This breaks my hex string into chunks based off the list 'DDF'
for x in DDF:
    output.append(Hex[distance:x*2+distance])
    distance += x*2

# This prints out the list of hex strings
for x in output:
    print(x)

# This prints out the bytes objects in the list
for x in output:
    x = codecs.decode(x, "hex")
    print(x)

# The next lines print the correct text
Hex = codecs.decode(Hex, "hex")
print(codecs.decode(Hex, 'cp1140'))

The output of the above is:

E3888540
83969497A4A3859940
8881A240
9985829696A3858440
8699969440
8140
82A48783888583924B
b'\xe3\x88\x85@'
b'\x83\x96\x94\x97\xa4\xa3\x85\x99@'
b'\x88\x81\xa2@'
b'\x99\x85\x82\x96\x96\xa3\x85\x84@'
b'\x86\x99\x96\x94@'
b'\x81@'
b'\x82\xa4\x87\x83\x88\x85\x83\x92K'
The computer has rebooted from a bugcheck.

So I guess my question is: how can I get Python to print the bytes object with '\x40' instead of '@'?

Thank you very much for your help :)

Recommended answer

I think your byte array is slightly off.

According to this, you need to use 'cp500' for decoding. Example:

my_string_in_hex = 'E388854083969497A4A38599408881A2409985829696A38584408699969440814082A48783888583924B'
my_bytes = bytearray.fromhex(my_string_in_hex)
print(my_bytes)

my_string = my_bytes.decode('cp500')
print(my_string)

Output:

bytearray(b'\xe3\x88\x85@\x83\x96\x94\x97\xa4\xa3\x85\x99@\x88\x81\xa2@\x99\x85\x82\x96\x96\xa3\x85\x84@\x86\x99\x96\x94@\x81@\x82\xa4\x87\x83\x88\x85\x83\x92K')
The computer has rebooted from a bugcheck.

When you print the bytearray, it still shows '@', but the byte is actually \x40 "under the covers". What you see is just the __repr__() of the object. Because that method takes no encoding argument, it cannot decode the bytes as EBCDIC; it simply builds a readable string for printing, showing bytes in the printable ASCII range as ASCII characters and the rest as \xNN escapes.

__repr__(), or repr(), is just that: a representation of the object, not the object itself. The byte is not "actually" an '@'; repr just uses that character when printing. The object is still a bytearray, not a string.
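As a minimal sketch of this point, using only the standard library: the literals b'@' and b'\x40' denote the same single byte, and only the repr chooses the ASCII glyph. The variable name `data` is illustrative and not from the original post.

```python
# b'@' and b'\x40' are two spellings of the same one-byte value.
data = bytes.fromhex('40')

print(data)                        # b'@'  (repr shows printable ASCII bytes as characters)
print(data == b'\x40')             # True
print(data[0])                     # 64, i.e. 0x40
print(repr(data.decode('cp500')))  # ' '  (0x40 is a space in EBCDIC)
print(repr(data.decode('ascii')))  # '@'  (0x40 is '@' in ASCII)
```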

When you decode, the bytes are decoded correctly using the code page you select.
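If the goal is only to display every byte as an escape, one possible approach is to build that string yourself instead of relying on repr. The helper name `escape_all` below is hypothetical and not part of the original answer; `bytes.hex()` is the built-in alternative when plain hex digits are enough.

```python
def escape_all(data: bytes) -> str:
    """Return a string in which every byte is written as a \\xNN escape."""
    return ''.join(f'\\x{byte:02x}' for byte in data)


chunk = bytes.fromhex('E3888540')  # first field from the question ("The ")
print(chunk)                       # b'\xe3\x88\x85@'
print(escape_all(chunk))           # \xe3\x88\x85\x40
print(chunk.hex())                 # e3888540
```

This changes only how the bytes are displayed; the underlying data is identical in all three cases.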
