How can I serialize a numpy array while preserving matrix dimensions?
Question
numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.reshape.
Is there a way to serialize a numpy array to JSON format while preserving this information?
Note: The arrays may contain ints, floats or bools. It's reasonable to expect a transposed array.
Note 2: This is being done with the intent of passing the numpy array through a Storm topology using streamparse, in case that information ends up being relevant.
Recommended answer
pickle.dumps or numpy.save encode all the information needed to reconstruct an arbitrary NumPy array, even in the presence of endianness issues, non-contiguous arrays, or weird tuple dtypes. Endianness issues are probably the most important; you don't want array([1]) to suddenly become array([16777216]) because you loaded your array on a big-endian machine. pickle is probably the more convenient option, though save has its own benefits, given in the npy format rationale.
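The endianness hazard mentioned above can be reproduced on any machine by reading raw bytes back with the wrong byte order. The following is a minimal sketch; the variable names are illustrative only, and it uses tobytes, the current name for tostring.

```python
import numpy as np

# Write the value 1 as a little-endian 32-bit integer and keep only the raw bytes.
raw = np.array([1], dtype="<i4").tobytes()

# Raw bytes carry no dtype or byte-order information, so the reader has to guess.
misread = np.frombuffer(raw, dtype=">i4")  # wrong guess: big-endian
correct = np.frombuffer(raw, dtype="<i4")  # right guess: little-endian

print(misread)  # [16777216]
print(correct)  # [1]
```

Both pickle and the npy format record the dtype, including its byte order, alongside the data, which is why they avoid this misreading.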
The pickle option:
import pickle
import numpy
a = numpy.arange(6).reshape(2, 3)  # stand-in for any NumPy array
serialized = pickle.dumps(a, protocol=0)  # protocol 0 is the original human-readable protocol
deserialized_a = pickle.loads(serialized)
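To check that this covers the transposed arrays mentioned in the question, here is a small sketch that compares a raw-bytes round trip with a pickle round trip. The array contents and variable names are assumptions chosen for illustration.

```python
import pickle
import numpy as np

# A transposed view: shape (3, 2), not C-contiguous in memory.
a = np.arange(6, dtype=np.int64).reshape(2, 3).T

# Raw bytes keep the values but drop the shape; the caller would have to reshape.
flat = np.frombuffer(a.tobytes(), dtype=a.dtype)

# pickle keeps shape and dtype along with the values.
restored = pickle.loads(pickle.dumps(a, protocol=0))

print(flat.shape)      # (6,)
print(restored.shape)  # (3, 2)
```

The pickled value is a bytes object under Python 3, so it would still need a text encoding step before being embedded in JSON.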
numpy.save uses a binary format, and it needs to write to a file, but you can get around that with an in-memory file object (StringIO in Python 2, io.BytesIO in Python 3):
import io
import json
import numpy
a = numpy.arange(6).reshape(2, 3)  # stand-in for any NumPy array
memfile = io.BytesIO()  # Python 3; the Python 2 original used StringIO.StringIO()
numpy.save(memfile, a)
memfile.seek(0)
serialized = json.dumps(memfile.read().decode('latin-1'))
# latin-1 maps byte n to unicode code point n
And to deserialize:
memfile = io.BytesIO()
memfile.write(json.loads(serialized).encode('latin-1'))
memfile.seek(0)
a = numpy.load(memfile)
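The two snippets can be wrapped into a pair of helpers for use in a pipeline. This is a sketch for Python 3 rather than a definitive implementation: the function names array_to_json and array_from_json are hypothetical, and passing allow_pickle=False is an added assumption that restricts the helpers to plain numeric and boolean arrays.

```python
import io
import json
import numpy as np


def array_to_json(arr):
    """Serialize an array to a JSON string via the npy format (hypothetical helper)."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    # latin-1 maps byte n to code point n, so the bytes survive as text.
    return json.dumps(buf.getvalue().decode("latin-1"))


def array_from_json(text):
    """Rebuild the array from a string produced by array_to_json."""
    buf = io.BytesIO(json.loads(text).encode("latin-1"))
    return np.load(buf, allow_pickle=False)


# Round-trip ints (transposed), floats, and bools, as listed in the question.
samples = [
    np.arange(6).reshape(2, 3).T,
    np.array([[1.5, 2.5]]),
    np.array([True, False]),
]
for original in samples:
    restored = array_from_json(array_to_json(original))
    assert restored.shape == original.shape
    assert restored.dtype == original.dtype
    assert np.array_equal(restored, original)
```

Because json.dumps escapes non-ASCII characters by default, the resulting string is plain ASCII, at the cost of some size overhead for binary-heavy payloads.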