Binary input with an ASCII text header, read from stdin

Problem description

I want to read a binary PNM image file from stdin. The file contains a header, which is encoded as ASCII text, and a payload, which is binary. As a simplified example of reading the header, I have created the following snippet:

#! /usr/bin/env python3
import sys
header = sys.stdin.readline()
print("header=["+header.strip()+"]")

I run it as "test.py" (from a Bash shell), and it works fine in this case:

$ printf "P5 1 1 255\n\x41" |./test.py 
header=[P5 1 1 255]

However, a small change in the binary payload breaks it:

$ printf "P5 1 1 255\n\x81" |./test.py 
Traceback (most recent call last):
  File "./test.py", line 3, in <module>
    header = sys.stdin.readline()
  File "/usr/lib/python3.4/codecs.py", line 313, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte

Is there an easy way to make this work in Python 3?

Recommended answer

According to the docs, you can read binary data (as type bytes) from stdin with sys.stdin.buffer.read():

To write or read binary data from/to the standard streams, use the underlying binary buffer object. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').

So this is one direction you can take: read the data in binary mode. readline() and various other functions still work. Once you have captured the ASCII header, you can convert it to text with decode('ASCII') for additional text-specific processing.
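
A minimal sketch of the binary-mode approach, using io.BytesIO as a stand-in for sys.stdin.buffer so that it runs without piped input. The helper name read_header is hypothetical, and the sketch assumes the whole header sits on one line, as in the question's example; real PNM files may spread the header over several lines and include comments.

```python
import io

def read_header(stream):
    """Read one header line from a binary stream and return its ASCII fields."""
    line = stream.readline()              # bytes, up to and including b"\n"
    return line.decode("ascii").split()   # decode only the header, then split on whitespace

# io.BytesIO stands in for sys.stdin.buffer here (an assumption for the demo).
stream = io.BytesIO(b"P5 1 1 255\n\x81")
fields = read_header(stream)
payload = stream.read()                   # the rest stays as raw bytes, never decoded
print(fields, payload)
```

Because only the header line is decoded, a payload byte such as 0x81 never reaches a text codec.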

Alternatively, you can wrap the input stream in io.TextIOWrapper() and specify the latin-1 character set. With this, the implicit decode operation is essentially a pass-through: the data will be of type str (which represents text), but each character maps 1-to-1 to an input byte (although the str may use more than one byte of storage per input byte).
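
The 1-to-1 mapping can be checked with a short round trip. This is only an illustrative sketch; the UTF-8 length at the end is one way to show that the text form can need more storage than the original bytes, and is not something the answer itself measures.

```python
data = bytes(range(256))           # every possible byte value
text = data.decode("latin-1")      # latin-1 assigns a code point to each of the 256 byte values

# Code point i corresponds to byte i, so encoding restores the original bytes.
assert [ord(ch) for ch in text] == list(range(256))
assert text.encode("latin-1") == data

# Re-encoded as UTF-8, the upper 128 characters take two bytes each.
utf8_length = len(text.encode("utf-8"))
print(len(data), len(text), utf8_length)
```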

Here's code that works in either mode:

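#! /usr/bin/python3

import io
import sys

BINARY = True  # either way works

if BINARY:
    istream = sys.stdin.buffer
else:
    # newline='\n' disables universal-newline translation, which would
    # otherwise rewrite a '\r' payload byte as '\n'.
    istream = io.TextIOWrapper(sys.stdin.buffer, encoding='latin-1', newline='\n')

header = istream.readline()
if BINARY:
    header = header.decode('ASCII')  # bytes -> str
print("header=[" + header.strip() + "]")

payload = istream.read()
print("len=" + str(len(payload)))
# Iterating over bytes yields ints; iterating over str yields 1-character strings.
for i in payload:
    print(i if BINARY else ord(i))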

Test every possible 1-pixel payload with the following Bash command:

for i in $(seq 0 255) ; do printf "P5 1 1 255\n\x$(printf %02x $i)" |./test.py ; done
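
The same sweep can be sketched in-process for the text-wrapper mode, with io.BytesIO standing in for sys.stdin.buffer. The helper name read_text_mode is hypothetical. One caveat that is added here and does not come from the answer: io.TextIOWrapper translates '\r' to '\n' on input by default, so this sketch passes newline='\n' to keep payload byte 0x0D intact.

```python
import io

def read_text_mode(raw):
    """Read header and payload through a latin-1 text wrapper (hypothetical helper)."""
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1", newline="\n")
    header = stream.readline().strip()
    payload = stream.read()
    return header, [ord(ch) for ch in payload]   # map characters back to byte values

# Feed one image per possible payload byte and collect any value that does not survive.
results = [read_text_mode(b"P5 1 1 255\n" + bytes([value])) for value in range(256)]
mismatches = [value for value, (header, payload) in enumerate(results)
              if header != "P5 1 1 255" or payload != [value]]
print("mismatches:", mismatches)
```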
