How to save numpy array of Strings (with commas) to CSV?


Problem description

tl;dr ANSWER: Don't use numpy. Use csv.writer instead of numpy.savetxt.

I'm new to Python and NumPy. It seems like it shouldn't be so difficult to save a 2D array of strings (that contain commas) to a CSV file, but I can't get it to work the way I want.

Let's say I have an array that looks like this (made from a list of lists):

[['text1, text2', 'text3'],
['text4', 'text5']]

I want a CSV file that looks like this (or without quote characters) in Excel (pipe = cell separator):

'text1, text2' | 'text3'
'text4'        | 'text5'

I'm using numpy.savetxt(filename, array, fmt="%s"), and I get the following CSV output (with square brackets):

['text1, text2','text3']
['text4','text5']

Which displays in Excel like this:

['text1  | text2' | 'text3']
['text4' | 'text5']

I tried fussing with the savetxt delimiter argument, but no change in output.

Do I need to do this manually? If so, let me know if there are any shortcuts I should be aware of.

Ultimately, I need to import the CSV into a Postgresql database. I'm not completely clear on exactly what the CSV formatting needs to be for this to work as expected, but I'm assuming if it looks wrong in Excel, it will probably end up messed up in Postgres. The Postgres documentation says:

The values in each record are separated by the DELIMITER character. If the value contains the delimiter character, the QUOTE character, the NULL string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by the QUOTE character, and any occurrence within the value of a QUOTE character or the ESCAPE character is preceded by the escape character. You can also use FORCE_QUOTE to force quotes when outputting non-NULL values in specific columns.

Thanks!

++++++++++++++++++++++++++++

Real input and output, in case it's relevantly different:

array:

[['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']]

CSV output:

['8908232', 'Plant Growth Chamber Facility at the Department of Botany, University of Wisconsin-Madison', 'DBI', 'INSTRUMENTAT & INSTRUMENT DEVP', '1/1/90', '12/19/89', 'WI', 'Standard Grant', 'Joann P. Roskoski', '12/31/91', '$94,914.00 ', 'BIO', '1108', '', '$0.00 ']

Excel's version:

['8908232'   'Plant Growth Chamber Facility at the Department of Botany  University of Wisconsin-Madison'    'DBI'   'INSTRUMENTAT & INSTRUMENT DEVP'    '1/1/90'    '12/19/89'  'WI'    'Standard Grant'    'Joann P. Roskoski'     '12/31/91'  '$94   914.00 '     'BIO'   '1108'  ''  '$0.00 ']                  

Solution

Adding fmt="%s" doesn't put quotes around each field—the quotes are part of the Python string literal for the string %s, and %s just says that any value should be formatted as a string. If you want to force quotes around everything, you need to have quotes in the format string, like fmt='"%s"'.
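To make the point about the quotes concrete, here is a small sketch using plain Python % formatting, which is the mini-language that fmt strings are written in. The variable names are only for illustration.

```python
value = "text1, text2"

# The double quotes here only delimit the Python literal "%s";
# they are not part of the format, so nothing is added to the output.
unquoted = "%s" % value

# Here the double quotes are inside the format string itself,
# so they are emitted around the value.
quoted = '"%s"' % value

print(unquoted)  # text1, text2
print(quoted)    # "text1, text2"
```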

However, even if you don't do that, the line you showed can't possibly produce the output you showed. There is no way that NumPy is changing your commas into pipe characters, or using pipe characters as delimiters. The only way you can get that is by adding delimiter=' | '. And if you add that… it works with no changes, and you get this:

text1, text2 | text3
text4 | text5

So whatever your actual problem is, it can't be the one you described.
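For anyone who wants to check this themselves, here is a minimal sketch of the savetxt call with a pipe delimiter. It assumes a reasonably recent NumPy on Python 3 and writes to an in-memory io.StringIO buffer instead of a real file, purely so the result can be inspected; the names rows and pipe_text are made up for the example.

```python
import io

import numpy as np

# The sample data from the question, built from a list of lists.
rows = np.array([["text1, text2", "text3"],
                 ["text4", "text5"]])

# Write into an in-memory text buffer instead of a file on disk.
buf = io.StringIO()
np.savetxt(buf, rows, fmt="%s", delimiter=" | ")
pipe_text = buf.getvalue()

# Each row becomes one line; the commas inside the strings are untouched,
# and no square brackets or quote characters are written.
print(pipe_text)
```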


Meanwhile, if you're trying to write CSV files for non-numeric data as flexibly as possible, the standard library's csv module is much more powerful than NumPy. The advantage of NumPy—as the name implies—is in dealing with numeric data. Here's how to do it with csv:

import csv
# Python 3: open in text mode with newline='' (Python 2 used 'wb' here)
with open(filename, 'w', newline='') as f:
    csv.writer(f).writerows(array)

This will default to , as a delimiter. Since some of your strings have , characters in them, by default, it will quote those strings. But you can configure the quoting/escaping behavior, the quote character, the delimiter, and all kinds of other things that NumPy can't.
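As a rough sketch of that configurability, assuming Python 3: the snippet below writes the sample rows once with the default quoting and once with csv.QUOTE_ALL, then reads the first result back to confirm the embedded comma survives. It uses io.StringIO in place of a real file, and the helper name to_csv_text is hypothetical.

```python
import csv
import io

rows = [["text1, text2", "text3"],
        ["text4", "text5"]]

def to_csv_text(data, **writer_options):
    """Write rows with csv.writer into an in-memory buffer and return the text."""
    buf = io.StringIO(newline="")
    csv.writer(buf, **writer_options).writerows(data)
    return buf.getvalue()

# Default behavior: only fields that contain the delimiter (or quotes/newlines) are quoted.
minimal = to_csv_text(rows)

# Force quotes around every field.
quote_all = to_csv_text(rows, quoting=csv.QUOTE_ALL)

# Reading the default output back recovers the original fields, comma included.
round_trip = list(csv.reader(io.StringIO(minimal, newline="")))

print(minimal)
print(quote_all)
print(round_trip)
```

Other writer options such as delimiter, quotechar and escapechar can be passed the same way; which combination a PostgreSQL import expects depends on the options given to COPY.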
