Convert large numpy arrays of BCD to decimal
Question
I have binary data files in the multiple GB range that I am memory mapping with numpy. The start of each data packet contains a BCD timestamp, where each hex digit encodes one decimal digit of the time format 0DDD:HH:MM:SS.ssssss. I need this timestamp converted into total seconds of the current year.
Example:
The first timestamp, 0x0261 1511 2604 6002,
would be 261:15:11:26.046002, or
261*86400 + 15*3600 + 11*60 + 26.046002 = 22605086.046002
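Each BCD byte packs two decimal digits, one per nibble: the high nibble is the tens digit and the low nibble the ones digit. Decoding a single byte can be sketched in plain Python (the helper name here is illustrative, not from the original code):

```python
# Decode one BCD byte: high nibble = tens digit, low nibble = ones digit.
def bcd_byte_to_int(b: int) -> int:
    return (b >> 4) * 10 + (b & 0x0F)

# The first two bytes of the example, 0x02 and 0x61, decode to 2 and 61,
# which together give day 261.
print(bcd_byte_to_int(0x02))  # 2
print(bcd_byte_to_int(0x61))  # 61
print(bcd_byte_to_int(0x26))  # 26 (the seconds byte)
```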
Currently I am doing this to compute the timestamps:
import numpy as np
rawData = np.memmap('dataFile.bin',dtype='u1',mode='r')
#findFrameStart returns the index to the start of each data packet [0,384,768,...]
fidx = findFrameStart(rawData)
# Do lots of bit shifting and multiplying and type casting....
day1 = ((rawData[fidx ]>>4)*10 + (rawData[fidx ]&0x0F)).astype('f8')
day2 = ((rawData[fidx+1]>>4)*10 + (rawData[fidx+1]&0x0F)).astype('f8')
hour = ((rawData[fidx+2]>>4)*10 + (rawData[fidx+2]&0x0F)).astype('f8')
mins = ((rawData[fidx+3]>>4)*10 + (rawData[fidx+3]&0x0F)).astype('f8')
sec1 = ((rawData[fidx+4]>>4)*10 + (rawData[fidx+4]&0x0F)).astype('f8')
sec2 = ((rawData[fidx+5]>>4)*10 + (rawData[fidx+5]&0x0F)).astype('f8')
sec3 = ((rawData[fidx+6]>>4)*10 + (rawData[fidx+6]&0x0F)).astype('f8')
sec4 = ((rawData[fidx+7]>>4)*10 + (rawData[fidx+7]&0x0F)).astype('f8')
time = (day1*100+day2)*86400 + hour*3600 + mins*60 + sec1 + sec2/100 + sec3/10000 + sec4/1000000
Note I had to cast each of the intermediate vars (day1, day2, etc.) to double to get time to compute correctly.
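The cast matters because the raw bytes are u1: without widening, a product like day1*100 wraps around at 256. A minimal demonstration of the overflow (values chosen for illustration):

```python
import numpy as np

day1 = np.array([3], dtype='u1')   # hundreds digit of a day in the 300s
print(day1 * 100)                  # stays uint8 and wraps: 300 % 256 = 44
print(day1.astype('f8') * 100)     # 300.0 once widened to double
```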
Given that there are lots of frames, fidx can get kind of large (~10e6 elements or more). This results in lots of math operations, bit shifts, casts, etc. in my current method. So far it works OK on a smaller test file (~180 ms on a 150 MB data file). However, I am worried that when I hit larger data (4-5 GB) there might be memory issues with all of the intermediate arrays.
So, if possible, I was looking for a different method that might cut some of the overhead. The BCD-to-decimal operation is the same for each byte, so it seems I should be able to iterate over something and perhaps convert an array in place, at least reducing the memory footprint.
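One way to bound peak memory on multi-GB files, independent of how each byte is decoded, is to process fidx in fixed-size chunks so every intermediate array stays small. A sketch under that assumption (the function name and chunk size are illustrative):

```python
import numpy as np

def timestamps_chunked(rawData, fidx, chunk=1_000_000):
    """Decode BCD timestamps chunk-by-chunk so intermediates stay small."""
    out = np.empty(fidx.shape, dtype='f8')
    # Weight of each of the 8 BCD bytes, in seconds
    scale = np.array([8640000, 86400, 3600, 60, 1, .01, .0001, .000001], dtype='f8')
    for start in range(0, len(fidx), chunk):
        idx = fidx[start:start + chunk]
        acc = np.zeros(idx.shape, dtype='f8')
        for ii, sf in enumerate(scale):
            b = rawData[idx + ii]
            # Two BCD digits per byte; values stay <= 99, so uint8 math is safe here
            acc += ((b >> 4) * 10 + (b & 0x0F)) * sf
        out[start:start + chunk] = acc
    return out
```

Only one chunk's worth of temporaries is alive at a time, regardless of total file size.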
Any help would be appreciated. FYI, I am using Python 3.7
Answer
I made the following adjustments to my code. This accumulates into the time array in place and removes the need for all of the intermediate arrays. I haven't timed the result, but it should require less memory.
time = np.zeros(fidx.shape, dtype='f8')
scale = np.array([8640000, 86400, 3600, 60, 1, .01, .0001, .000001], dtype='f8')
for ii, sf in enumerate(scale):
    # accumulate in place: no new full-size temporary per term
    time += ((rawData[fidx+ii] >> 4) * 10 + (rawData[fidx+ii] & 0x0F)) * sf
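As a quick sanity check, running this loop on a synthetic one-frame buffer holding the example timestamp gives the expected value (the buffer and fidx below are made up for the test):

```python
import numpy as np

# One frame containing the BCD timestamp 0x0261 1511 2604 6002
rawData = np.array([0x02, 0x61, 0x15, 0x11, 0x26, 0x04, 0x60, 0x02], dtype='u1')
fidx = np.array([0])

time = np.zeros(fidx.shape, dtype='f8')
scale = np.array([8640000, 86400, 3600, 60, 1, .01, .0001, .000001], dtype='f8')
for ii, sf in enumerate(scale):
    time += ((rawData[fidx + ii] >> 4) * 10 + (rawData[fidx + ii] & 0x0F)) * sf

print(time[0])  # ~22605086.046002
```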