How can I serialize a numpy array while preserving matrix dimensions?

Question

numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.reshape.
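To make the problem concrete, here is a minimal sketch of the information loss. It uses `tobytes`, the current name for `tostring`; the sample array `a` is an assumption chosen for illustration.

```python
import numpy

# Hypothetical 2x3 sample array, chosen only for illustration.
a = numpy.arange(6, dtype=numpy.int64).reshape(2, 3)

raw = a.tobytes()  # raw element bytes only: no shape, no dtype

# Both the dtype and the shape have to be supplied by hand on the way back.
flat = numpy.frombuffer(raw, dtype=a.dtype)  # comes back one-dimensional
restored = flat.reshape(a.shape)
```

Without the separately stored `a.dtype` and `a.shape`, the receiver cannot tell a 2x3 array from a 3x2 one or from a flat array of six elements.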

Is there a way to serialize a numpy array to JSON format while preserving this information?

Note: The arrays may contain ints, floats or bools. It's reasonable to expect a transposed array.

Note 2: this is being done with the intent of passing the numpy array through a Storm topology using streamparse, in case such information ends up being relevant.

Recommended answer

pickle.dumps or numpy.save encode all the information needed to reconstruct an arbitrary NumPy array, even in the presence of endianness issues, non-contiguous arrays, or weird tuple dtypes. Endianness issues are probably the most important; you don't want array([1]) to suddenly become array([16777216]) because you loaded your array on a big-endian machine. pickle is probably the more convenient option, though save has its own benefits, given in the npy format rationale.
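The endianness hazard can be reproduced on any machine by reinterpreting the same four bytes with the opposite byte order. This is only an illustrative sketch; the explicit `'<i4'` and `'>i4'` dtypes are assumptions standing in for "written on a little-endian machine" and "read on a big-endian machine".

```python
import numpy

# The value 1 stored as a little-endian 32-bit integer.
little = numpy.array([1], dtype='<i4')
raw = little.tobytes()  # the bytes 01 00 00 00, with no record of byte order

# Reading the same bytes as big-endian gives 0x01000000.
misread = numpy.frombuffer(raw, dtype='>i4')
print(misread)  # [16777216]
```

Formats that record the dtype, including its byte order, alongside the data avoid this misreading.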

The pickle option:

import pickle
import numpy
a = numpy.arange(6).reshape(2, 3)  # stand-in for some NumPy array
serialized = pickle.dumps(a, protocol=0)  # protocol 0 is the original human-readable (text) protocol
deserialized_a = pickle.loads(serialized)
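Since the question asks for JSON, one way to carry the pickled bytes inside a JSON string is the latin-1 round trip. The sketch below assumes Python 3, where `pickle.dumps` returns `bytes`; the transposed boolean array is a made-up sample.

```python
import json
import pickle

import numpy

# Hypothetical sample: a transposed (non-contiguous) boolean array.
a = numpy.array([[True, False, False], [False, True, True]]).T

# bytes -> str via latin-1 (byte n becomes code point n) -> JSON string
payload = json.dumps(pickle.dumps(a, protocol=0).decode('latin-1'))

# JSON string -> str -> the original bytes -> array
restored = pickle.loads(json.loads(payload).encode('latin-1'))
```

Keep in mind that `pickle.loads` can execute arbitrary code, so this route is only appropriate when the sender is trusted.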

numpy.save uses a binary format, and it needs to write to a file, but you can get around that with io.BytesIO:

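import io
import json
import numpy
a = numpy.arange(6).reshape(2, 3)  # stand-in for any NumPy array
memfile = io.BytesIO()
numpy.save(memfile, a)  # writes the npy header (dtype, shape, order) plus the data
memfile.seek(0)
serialized = json.dumps(memfile.read().decode('latin-1'))
# latin-1 maps byte n to unicode code point n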

And to deserialize:

memfile = io.BytesIO()
memfile.write(json.loads(serialized).encode('latin-1'))
memfile.seek(0)
a = numpy.load(memfile)
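The two snippets above can be wrapped into a pair of helpers and checked against the cases mentioned in the question (ints, floats, bools, and a transposed array). The names `array_to_json` and `array_from_json` are hypothetical, and the sample arrays are made up for the check.

```python
import io
import json

import numpy


def array_to_json(a):
    """Serialize an array to a JSON string via the npy format (hypothetical helper)."""
    memfile = io.BytesIO()
    numpy.save(memfile, a)
    # latin-1 maps byte n to code point n, so no byte is lost.
    return json.dumps(memfile.getvalue().decode('latin-1'))


def array_from_json(serialized):
    """Rebuild the array from a string produced by array_to_json."""
    memfile = io.BytesIO(json.loads(serialized).encode('latin-1'))
    return numpy.load(memfile)


# Made-up samples: transposed ints, floats, and bools.
samples = [
    numpy.arange(6).reshape(2, 3).T,
    numpy.linspace(0.0, 1.0, 4).reshape(2, 2),
    numpy.array([[True, False, True]]),
]
restored = [array_from_json(array_to_json(s)) for s in samples]

for original, copy in zip(samples, restored):
    assert copy.shape == original.shape
    assert copy.dtype == original.dtype
    assert numpy.array_equal(copy, original)
```

Shape and dtype survive because the npy header stores them next to the data, so no `reshape` call is needed on the receiving side.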
