Python: reading 12-bit binary files


Question

I am trying to read 12-bit binary files containing images (a video) using Python 3.

To read a similar file that is encoded in 16 bits, the following works very well:

import numpy as np
images = np.memmap(filename_video, dtype=np.uint16, mode='r', shape=(nb_frames, height, width))

where filename_video is the file, and nb_frames, height, and width are characteristics of the video that can be read from another file. By "works very well" I mean fast: reading a 640x256 video that has 140 frames takes about 1 ms on my computer.
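For readers who want to try the 16-bit case without the original data, here is a minimal sketch. The dimensions and the file name `video_16bit.raw` are made up for illustration, and the file is assumed to be a headerless dump of uint16 pixels in native byte order. Part of the reason the call returns so quickly is probably that np.memmap only maps the file and defers the actual reads until elements are accessed.

```python
import os
import tempfile

import numpy as np

# Hypothetical, deliberately tiny dimensions (not those of the real video).
nb_frames, height, width = 3, 4, 5

# Synthetic 16-bit frames standing in for the real data.
frames = np.arange(nb_frames * height * width, dtype=np.uint16)
frames = frames.reshape(nb_frames, height, width)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumption: the file is a raw dump of pixels with no header.
    filename_video = os.path.join(tmp_dir, "video_16bit.raw")
    frames.tofile(filename_video)

    # Same call as in the question: nothing is read until data is accessed.
    images = np.memmap(filename_video, dtype=np.uint16, mode='r',
                       shape=(nb_frames, height, width))

    # Copy one frame into ordinary memory; this is where the read happens.
    first_frame = np.array(images[0])

    # Release the mapping before the temporary directory is removed.
    del images

print(first_frame.shape)
```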

As far as I know, I cannot use this when the file is encoded in 12 bits, because there is no uint12 type. So what I am trying to do is read a 12-bit file and store it in a 16-bit uint array. The following, taken from (Python: reading 12 bit packed binary image), works:

import bitstring
import numpy as np

with open(filename_video, 'rb') as f:
    data = f.read()
# Every 3 bytes hold two 12-bit values.
images = np.zeros(2 * len(data) // 3, dtype=np.uint16)
ii = 0
for jj in range(0, len(data) - 2, 3):
    a = bitstring.Bits(bytes=data[jj:jj+3], length=24)
    images[ii], images[ii+1] = a.unpack('uint:12,uint:12')
    ii += 2
images = np.reshape(images, (nb_frames, height, width))

However, this is very slow: reading a 640x256 video that has only 5 frames takes about 11.5 s on my machine. Ideally I would like to read 12-bit files as efficiently as I can read 8-bit or 16-bit files using memmap, or at least not 10^5 times slower. How can I speed things up?

Here is an example file: http://s000.tinyupload.com/index.php?file_id=26973488795334213426 (nb_frames=5, height=256, width=640).

Recommended answer

I have a slightly different implementation from the one proposed by @max9111; it doesn't require a call to unpackbits.

It creates two uint12 values directly from three consecutive uint8 values by cutting the middle byte in half and using numpy's binary operations. In the following, data_chunk is assumed to be a binary string containing the information for an arbitrary number of 12-bit integers (hence its length must be a multiple of 3).

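import numpy as np

def read_uint12(data_chunk):
    data = np.frombuffer(data_chunk, dtype=np.uint8)
    # One row per 3-byte group; widen to uint16 so the shifts below cannot overflow.
    fst_uint8, mid_uint8, lst_uint8 = np.reshape(data, (data.shape[0] // 3, 3)).astype(np.uint16).T
    # First value: byte 0 followed by the high nibble of byte 1.
    fst_uint12 = (fst_uint8 << 4) + (mid_uint8 >> 4)
    # Second value: low nibble of byte 1 followed by byte 2.
    snd_uint12 = ((mid_uint8 % 16) << 8) + lst_uint8
    # Interleave the two columns so the values come out in file order.
    return np.reshape(np.concatenate((fst_uint12[:, None], snd_uint12[:, None]), axis=1), 2 * fst_uint12.shape[0])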
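To make the bit arithmetic concrete, here is a small worked example on a single group of three bytes, using plain Python integers. The helper names `unpack_triple` and `pack_pair` are hypothetical and only illustrate the layout this answer assumes (first value in byte 0 plus the high nibble of byte 1, second value in the low nibble of byte 1 plus byte 2); a file packed in a different order would need different shifts.

```python
def unpack_triple(b0, b1, b2):
    """Split three bytes into two 12-bit values (layout assumed by this answer)."""
    first = (b0 << 4) | (b1 >> 4)      # byte 0, then high nibble of byte 1
    second = ((b1 & 0x0F) << 8) | b2   # low nibble of byte 1, then byte 2
    return first, second


def pack_pair(first, second):
    """Inverse of unpack_triple, handy for building test data."""
    b0 = first >> 4
    b1 = ((first & 0x0F) << 4) | (second >> 8)
    b2 = second & 0xFF
    return b0, b1, b2


first, second = unpack_triple(0xAB, 0xCD, 0xEF)
print(hex(first), hex(second))  # 0xabc 0xdef
```

The hex digits make the split easy to see: the middle byte 0xCD contributes its C to the first value and its D to the second.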

I benchmarked this against the other implementation, and this approach proved to be ~4x faster on a ~5 MB input:
read_uint12_unpackbits: 65.5 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
read_uint12: 14 ms ± 513 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
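Since the question asks for something memmap-like, one possible way to combine the two ideas is to map the file as uint8 and apply the same nibble arithmetic. This is only a sketch, not benchmarked here: `read_uint12_video` is a hypothetical helper, and it assumes a headerless file, the byte layout used above, and an even total number of pixels.

```python
import os
import tempfile

import numpy as np


def read_uint12_video(filename_video, nb_frames, height, width):
    """Hypothetical helper: unpack a 12-bit packed video into a uint16 array."""
    n_pixels = nb_frames * height * width
    if n_pixels % 2:
        raise ValueError("expected an even number of 12-bit values")
    # Two pixels per 3 bytes; map the file as rows of 3 bytes.
    raw = np.memmap(filename_video, dtype=np.uint8, mode='r',
                    shape=(n_pixels // 2, 3))
    # Widen each byte column to uint16 before shifting.
    fst = raw[:, 0].astype(np.uint16)
    mid = raw[:, 1].astype(np.uint16)
    lst = raw[:, 2].astype(np.uint16)
    del raw  # the copies above no longer need the mapping
    images = np.empty(n_pixels, dtype=np.uint16)
    images[0::2] = (fst << 4) | (mid >> 4)       # even-indexed pixels
    images[1::2] = ((mid & 0x0F) << 8) | lst     # odd-indexed pixels
    return images.reshape(nb_frames, height, width)


# Tiny made-up file: 4 pixels (0xABC, 0xDEF, 0x123, 0x456) packed into 6 bytes.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "video_12bit.raw")
    with open(path, 'wb') as f:
        f.write(bytes([0xAB, 0xCD, 0xEF, 0x12, 0x34, 0x56]))
    demo = read_uint12_video(path, nb_frames=1, height=2, width=2)

print(demo)
```

Unlike the 16-bit memmap call, this still has to touch every byte to unpack it, so it should not be expected to match the roughly 1 ms figure quoted in the question.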
