Python/Numpy - Save Array with Column AND Row Titles


Problem description

I want to save a 2D array to a CSV file with row and column "header" information (like a table). I know that I could use the header argument to numpy.savetxt to save the column names, but is there any easy way to also include some other array (or list) as the first column of data (like row titles)?

Below is an example of how I currently do it. Is there a better way to include those row titles, perhaps some trick with savetxt I'm unaware of?

import csv
import numpy as np

data = np.arange(12).reshape(3, 4)
# Add a '' for the first column because the row titles go there...
cols = ['', 'col1', 'col2', 'col3', 'col4']
rows = ['row1', 'row2', 'row3']

# On Python 3 the csv module wants a text-mode file opened with newline=''
# (on Python 2 this was open('test.csv', 'wb')).
with open('test.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(cols)
    for row_title, data_row in zip(rows, data):
        writer.writerow([row_title] + data_row.tolist())

Solution

Maybe you'd prefer to do something like this:

# Column of row titles
rows = np.array(['row1', 'row2', 'row3'], dtype='U20')[:, np.newaxis]
with open('test.csv', 'w') as f:
    np.savetxt(f, np.hstack((rows, data)), delimiter=', ', fmt='%s')

This implicitly converts data to an array of strings, and takes about 200 ms per million items on my computer.

The dtype 'U20' means fixed-width strings of at most twenty characters. (On Python 2 the byte-string dtype '|S20' did the same job; on Python 3 a byte-string array would be written out as b'row1' by fmt='%s'.) If the width is too small, your numbers will get chopped:

>>> np.asarray([123], dtype='U2')
array(['12'], dtype='<U2')

Another option, which in my limited testing is slower but gives you much more control over the number format and avoids the chopping issue, is to use np.char.mod:

# Column of row titles
rows = np.array(['row1', 'row2', 'row3'])[:, np.newaxis]
str_data = np.char.mod("%10.6f", data)
with open('test.csv', 'w') as f:
    np.savetxt(f, np.hstack((rows, str_data)), delimiter=', ', fmt='%s')
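Neither snippet above writes the column titles the question also asked for. A minimal sketch of one way to add them, assuming the header and comments arguments of np.savetxt are acceptable here: the header is built by joining cols, and comments='' keeps savetxt from prefixing that line with '# '. The output path and the explicit astype(str) conversion are illustrative choices, not part of the original answer.

```python
import os
import tempfile

import numpy as np

data = np.arange(12).reshape(3, 4)
# The leading '' sits above the column of row titles
cols = ['', 'col1', 'col2', 'col3', 'col4']
rows = np.array(['row1', 'row2', 'row3'])[:, np.newaxis]

# Convert the numbers to strings explicitly, then stack the row titles on the left
table = np.hstack((rows, data.astype(str)))

# Hypothetical output location, chosen only for this example
path = os.path.join(tempfile.gettempdir(), 'test_with_titles.csv')

# comments='' stops savetxt from writing '# ' in front of the header line
np.savetxt(path, table, delimiter=',', fmt='%s',
           header=','.join(cols), comments='')

# Read the file back to inspect what was written
with open(path) as f:
    lines = f.read().splitlines()
print('\n'.join(lines))
```

The first line of the file should then be the column titles and each following line a row title followed by that row's values. To control the number format, np.char.mod could be swapped in for astype(str) as in the answer above.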
