How to read a numpy 2D array from a string?
Question
How can I read a numpy array from a string? Take a string like:
[[ 0.5544 0.4456], [ 0.8811 0.1189]]
and convert it to an array:
a = from_string("[[ 0.5544 0.4456], [ 0.8811 0.1189]]")
where a becomes the object: np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
Update:
I'm looking for a very simple interface: a way to convert 2D arrays (of floats) to a string, and a way to read them back to reconstruct the array:
arr_to_string(array([[0.5544, 0.4456], [0.8811, 0.1189]]))
should return "[[ 0.5544 0.4456], [ 0.8811 0.1189]]"
string_to_arr("[[ 0.5544 0.4456], [ 0.8811 0.1189]]")
should return the object array([[0.5544, 0.4456], [0.8811, 0.1189]])
Ideally, arr_to_string would have a precision parameter that controls the precision of the floats converted to strings, so that you wouldn't get entries like 0.4444444999999999999999999.
I can't find anything in the numpy docs that does this both ways. np.save lets you make a string, but then there's no way to load it back in (np.load only works for files).
Recommended answer
The challenge is to save not only the data buffer, but also the shape and dtype. np.fromstring reads the data buffer, but as a 1d array; you have to get the dtype and shape from elsewhere.
In [184]: a=np.arange(12).reshape(3,4)
In [185]: np.fromstring(a.tostring(),int)
Out[185]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
In [186]: np.fromstring(a.tostring(),a.dtype).reshape(a.shape)
Out[186]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
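The session above uses tostring and the binary mode of np.fromstring, which newer NumPy releases deprecate. As a rough sketch of the same round trip on a current NumPy, assuming tobytes and np.frombuffer are available in your version:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# tobytes() returns only the raw data buffer; shape and dtype are not included
raw = a.tobytes()

# frombuffer() gives back a read-only 1d view, so dtype and shape
# must be supplied from elsewhere (here, from the original array)
flat = np.frombuffer(raw, dtype=a.dtype)

# copy() yields a writable array that no longer depends on the bytes object
restored = flat.reshape(a.shape).copy()
```

The dtype and shape still have to travel alongside the bytes, which is the limitation described above.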
A time-honored mechanism for saving Python objects is pickle, and numpy is pickle compliant:
In [169]: import pickle
In [170]: a=np.arange(12).reshape(3,4)
In [171]: s=pickle.dumps(a*2)
In [172]: s
Out[172]: "cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I3\nI4\ntp6\ncnumpy\ndtype\np7\n(S'i4'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x06\\x00\\x00\\x00\\x08\\x00\\x00\\x00\\n\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x0e\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x12\\x00\\x00\\x00\\x14\\x00\\x00\\x00\\x16\\x00\\x00\\x00'\np13\ntp14\nb."
In [173]: pickle.loads(s)
Out[173]:
array([[ 0, 2, 4, 6],
[ 8, 10, 12, 14],
[16, 18, 20, 22]])
There's a numpy function that can read the pickle string (np.loads was later deprecated in favor of pickle.loads):
In [181]: np.loads(s)
Out[181]:
array([[ 0, 2, 4, 6],
[ 8, 10, 12, 14],
[16, 18, 20, 22]])
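The sessions above were run on Python 2, where pickle.dumps returns a str. On Python 3 it returns bytes, so if a text string is required, one option (an assumption on my part, not something the original answer uses) is to wrap the pickle in base64:

```python
import base64
import pickle

import numpy as np

a = np.arange(12).reshape(3, 4) * 2

# pickle.dumps() returns bytes on Python 3; base64 turns them into ASCII text
s = base64.b64encode(pickle.dumps(a)).decode("ascii")

# Reverse the two steps to rebuild the array.
# Only unpickle data from a source you trust: pickle can run arbitrary code.
restored = pickle.loads(base64.b64decode(s))
```

The resulting string keeps shape and dtype, but it is not human readable.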
You mentioned using np.save to make a string, but that you can't use np.load. A way around that is to step further into the code, and use np.lib.npyio.format.
In [174]: import StringIO
In [175]: S=StringIO.StringIO() # a file like string buffer
In [176]: np.lib.npyio.format.write_array(S,a*3.3)
In [177]: S.seek(0) # rewind the string
In [178]: np.lib.npyio.format.read_array(S)
Out[178]:
array([[ 0. , 3.3, 6.6, 9.9],
[ 13.2, 16.5, 19.8, 23.1],
[ 26.4, 29.7, 33. , 36.3]])
The save string has a header with dtype and shape info:
In [179]: S.seek(0)
In [180]: S.readlines()
Out[180]:
["\x93NUMPY\x01\x00F\x00{'descr': '<f8', 'fortran_order': False, 'shape': (3, 4), } \n",
'\x00\x00\x00\x00\x00\x00\x00\x00ffffff\n',
'@ffffff\x1a@\xcc\xcc\xcc\xcc\xcc\xcc#@ffffff*@\x00\x00\x00\x00\x00\x800@\xcc\xcc\xcc\xcc\xcc\xcc3@\x99\x99\x99\x99\x99\x197@ffffff:@33333\xb3=@\x00\x00\x00\x00\x00\x80@@fffff&B@']
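The StringIO module in the session above is Python 2 only. A minimal sketch of the same idea on Python 3, assuming np.save and np.load are given an in-memory io.BytesIO buffer instead of a file on disk:

```python
import io

import numpy as np

a = np.arange(12).reshape(3, 4) * 3.3

# np.save() accepts any binary file-like object, not only a filename
buf = io.BytesIO()
np.save(buf, a)
data = buf.getvalue()  # bytes: the .npy header followed by the data buffer

# np.load() reads from a file-like object too, so the bytes can be loaded back
restored = np.load(io.BytesIO(data))
```

Because the header carries the dtype and shape, nothing else has to be stored alongside the bytes.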
If you want a human-readable string, you might try json.
In [196]: import json
In [197]: js=json.dumps(a.tolist())
In [198]: js
Out[198]: '[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]'
In [199]: np.array(json.loads(js))
Out[199]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Going to/from the list representation of the array is the most obvious use of json. Someone may have written a more elaborate json representation of arrays.
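As a sketch of how the json route could cover the interface asked for in the question: arr_to_string and string_to_arr are the question's hypothetical names, not numpy functions, and rounding with np.round before dumping is an assumption about how the precision parameter should behave.

```python
import json

import numpy as np


def arr_to_string(arr, precision=4):
    """Hypothetical helper: dump a float array as a JSON list of lists."""
    # Round first so the string does not carry long floating-point tails
    rounded = np.round(np.asarray(arr, dtype=float), precision)
    return json.dumps(rounded.tolist())


def string_to_arr(s):
    """Hypothetical helper: rebuild the array from the JSON string."""
    return np.array(json.loads(s), dtype=float)


s = arr_to_string(np.array([[0.5544, 0.4456], [0.8811, 0.1189]]))
a = string_to_arr(s)
```

The nesting of the lists preserves the shape, but the dtype is not stored; this sketch simply assumes floats on the way back in.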
You could also go the csv format route; there have been lots of questions about reading/writing csv arrays.
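One way the csv route might look, sketched with np.savetxt and np.loadtxt on an in-memory io.StringIO buffer. The fmt value, the delimiter, and ndmin=2 are illustrative choices, not something the original answer specifies:

```python
import io

import numpy as np

a = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])

# fmt controls the precision of each value in the text output
buf = io.StringIO()
np.savetxt(buf, a, fmt="%.4f", delimiter=",")
text = buf.getvalue()

# ndmin=2 keeps the result 2D even when the text holds a single row
restored = np.loadtxt(io.StringIO(text), delimiter=",", ndmin=2)
```

The text has one row per line, so a 2D shape survives the round trip, while the dtype defaults to float on loading.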
'[[ 0.5544 0.4456], [ 0.8811 0.1189]]' is a poor string representation for this purpose. It does look a lot like the str() of an array, but with , instead of \n between rows. But there isn't a clean way of parsing the nested [], and the missing delimiter between values is a pain. If it consistently used , then json could convert it to a list.
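If the string format from the question cannot be changed, a possible workaround is to insert the missing commas with a regular expression and then hand the result to json. This is only a sketch: string_to_arr is the question's hypothetical name, and the pattern assumes every number has digits on both sides of the decimal point, as in the question's example.

```python
import json
import re

import numpy as np


def string_to_arr(s):
    """Hypothetical helper: parse '[[ 0.5 0.4], [ 0.8 0.1]]' style strings."""
    # Put a comma wherever whitespace separates the end of one number
    # (a digit) from the start of the next (a digit or a minus sign)
    with_commas = re.sub(r"(?<=\d)\s+(?=[-\d])", ", ", s)
    return np.array(json.loads(with_commas), dtype=float)


arr = string_to_arr("[[ 0.5544 0.4456], [ 0.8811 0.1189]]")
```

Values that numpy prints as `1.` or `nan` would not pass through json this way, which is part of why a consistently delimited format is the safer choice.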
np.matrix accepts a MATLAB-like string:
In [207]: np.matrix(' 0.5544, 0.4456;0.8811, 0.1189')
Out[207]:
matrix([[ 0.5544, 0.4456],
[ 0.8811, 0.1189]])
In [208]: str(np.matrix(' 0.5544, 0.4456;0.8811, 0.1189'))
Out[208]: '[[ 0.5544 0.4456]\n [ 0.8811 0.1189]]'