How can I serialize a numpy array while preserving matrix dimensions?
Question
numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.reshape.
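To make the problem concrete, here is a minimal sketch of the shape loss. It uses tobytes, the current name for the same operation (tostring is a deprecated alias), and the sample array and dtype are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sample array; any shape and dtype would show the same effect.
a = np.arange(6, dtype=np.int32).reshape(2, 3)

# tobytes() emits only the raw element bytes: no shape, no dtype.
raw = a.tobytes()

# The reader has to supply the dtype by hand and gets a flat array back.
flat = np.frombuffer(raw, dtype=np.int32)
print(flat.shape)

# The original dimensions must be known out of band to restore them.
restored = flat.reshape(2, 3)
```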
Is there a way to serialize a numpy array to JSON format while preserving this information?
Note: The arrays may contain ints, floats or bools. It's reasonable to expect a transposed array.
Note 2: This is being done with the intent of passing the numpy array through a Storm topology using streamparse, in case that ends up being relevant.
Recommended answer
pickle.dumps or numpy.save encode all the information needed to reconstruct an arbitrary NumPy array, even in the presence of endianness issues, non-contiguous arrays, or weird tuple dtypes. Endianness issues are probably the most important; you don't want array([1]) to suddenly become array([16777216]) because you loaded your array on a big-endian machine. pickle is probably the more convenient option, though save has its own benefits, given in the npy format rationale.
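The 16777216 figure can be reproduced without a big-endian machine. The sketch below assumes a 32-bit integer array and simply reinterprets its little-endian bytes as big-endian, which is what happens when raw bytes are read back with no byte-order information.

```python
import numpy as np

# A single 32-bit integer stored explicitly in little-endian order.
little = np.array([1], dtype="<i4")
raw = little.tobytes()  # b'\x01\x00\x00\x00'

# Reading the same four bytes as big-endian gives 0x01000000.
misread = np.frombuffer(raw, dtype=">i4")
print(misread)  # [16777216]
```

Both pickle and the npy format record the byte order alongside the data, so neither is affected by this.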
The pickle option:
import pickle
import numpy

a = numpy.arange(6).reshape(2, 3)  # stand-in for any NumPy array
serialized = pickle.dumps(a, protocol=0)  # protocol 0 is the original, human-readable text protocol
deserialized_a = pickle.loads(serialized)
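Since the question asks for JSON, the pickled bytes still need to become a JSON string. One possible sketch is below; the helper names to_json_pickle and from_json_pickle are hypothetical, and the latin-1 decoding is an assumption borrowed from the numpy.save approach so that every byte value maps to exactly one code point.

```python
import json
import pickle

import numpy as np


def to_json_pickle(arr):
    """Pickle an array and wrap the bytes in a JSON string (hypothetical helper)."""
    payload = pickle.dumps(arr, protocol=0)
    # latin-1 maps byte n to code point n, so no byte is lost or rejected.
    return json.dumps(payload.decode("latin-1"))


def from_json_pickle(text):
    """Reverse of to_json_pickle (hypothetical helper)."""
    return pickle.loads(json.loads(text).encode("latin-1"))


# A transposed float array, as mentioned in the question.
a = np.arange(6, dtype=np.float64).reshape(2, 3).T
restored = from_json_pickle(to_json_pickle(a))
print(restored.shape, restored.dtype)
```

As with any pickle, only unpickle data that comes from a source you trust.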
numpy.save uses a binary format, and it needs to write to a file, but you can get around that with io.BytesIO:
import io
import json
import numpy

a = numpy.arange(6).reshape(2, 3)  # stand-in for any NumPy array
memfile = io.BytesIO()
numpy.save(memfile, a)
memfile.seek(0)
serialized = json.dumps(memfile.read().decode('latin-1'))
# latin-1 maps byte n to unicode code point n
And to deserialize:
memfile = io.BytesIO()
memfile.write(json.loads(serialized).encode('latin-1'))
memfile.seek(0)
a = numpy.load(memfile)
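The two halves can be wrapped into a pair of functions and checked against the array types listed in the question. This is a sketch rather than a definitive implementation: the names array_to_json and json_to_array are hypothetical, and passing allow_pickle=False is an added assumption that restricts the format to plain numeric and boolean arrays.

```python
import io
import json

import numpy as np


def array_to_json(arr):
    """Serialize an array to a JSON string via the npy format (hypothetical helper)."""
    memfile = io.BytesIO()
    np.save(memfile, arr, allow_pickle=False)
    # latin-1 maps byte n to code point n, so the bytes survive the round trip.
    return json.dumps(memfile.getvalue().decode("latin-1"))


def json_to_array(text):
    """Rebuild the array from a string produced by array_to_json (hypothetical helper)."""
    memfile = io.BytesIO(json.loads(text).encode("latin-1"))
    return np.load(memfile, allow_pickle=False)


# Ints, a transposed float array, and bools, as in the question.
samples = [
    np.arange(6).reshape(2, 3),
    np.linspace(0.0, 1.0, 4).reshape(2, 2).T,
    np.array([[True, False], [False, True]]),
]
results = [json_to_array(array_to_json(s)) for s in samples]

for original, restored in zip(samples, results):
    print(restored.shape, restored.dtype, np.array_equal(original, restored))
```

No reshape call is needed on the receiving side, because the shape and dtype travel in the npy header.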