Pivoting a DataFrame in Pandas for output to CSV

Question

This is a simple question for which answers are surprisingly difficult to find online. Here's the situation:

>>> import pandas
>>> A
[('hey', 'you', 4), ('hey', 'not you', 5), ('not hey', 'you', 2), ('not hey', 'not you', 6)]
>>> A_p = pandas.DataFrame(A)
>>> A_p
         0        1  2
0      hey      you  4
1      hey  not you  5
2  not hey      you  2
3  not hey  not you  6
>>> B_p = A_p.pivot(index=0, columns=1, values=2)
>>> B_p
1        not you  you
0                    
hey            5    4
not hey        6    2

This isn't quite what's suggested in the documentation for pivot: there, the results are shown without the 1 and 0 in the upper-left corner. That is what I'm looking for, a DataFrame object that prints as

         not you  you
hey            5    4
not hey        6    2

The problem is that the normal behavior results in a CSV file whose first line is

0,not you,you

when I really want

not you, you

When the normal CSV file (with the preceding "0,") is read into R, R doesn't properly set the column and row names from the frame object, which means painful manual manipulation to get it into the right format. Is there a way to get pivot to give me a DataFrame object without that additional information in the upper-left corner?

Recommended answer

You have:

In [16]: import sys

In [17]: B_p.to_csv(sys.stdout)
0,not you,you
hey,5.0,4.0
not hey,6.0,2.0

In [18]: B_p.to_csv(sys.stdout, index=False)
not you,you
5.0,4.0
6.0,2.0

But I assume you want the row names. Setting the index name to None (B_p.index.name = None) gives a leading comma:

In [20]: B_p.to_csv(sys.stdout)
,not you,you
hey,5.0,4.0
not hey,6.0,2.0

This roughly matches (ignoring quoted strings) what R writes in write.csv when row.names=TRUE:

"","a","b"
"foo",0.720538259472741,-0.848304940318957
"bar",-0.64266667412325,-0.442441171401282
"baz",-0.419181615269841,-0.658545964124229
"qux",0.881124313748992,0.36383198969179
"bar2",-1.35613767310069,-0.124014006180608

Does any of this help?
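
For anyone who wants to try the index-name approach end to end, here is a minimal sketch using the data from the question. It writes to an in-memory buffer instead of sys.stdout so the header line can be inspected; the helper name csv_header is made up for this illustration and is not part of pandas.

```python
import io

import pandas as pd

# Same data as in the question.
A = [('hey', 'you', 4), ('hey', 'not you', 5),
     ('not hey', 'you', 2), ('not hey', 'not you', 6)]
A_p = pd.DataFrame(A)

# Keyword arguments are used because recent pandas versions
# no longer accept positional arguments to pivot.
B_p = A_p.pivot(index=0, columns=1, values=2)

# The "0" and "1" printed in the corner are the names of the two axes.
print(B_p.index.name, B_p.columns.name)


def csv_header(frame):
    """Return the first line that to_csv would write (hypothetical helper)."""
    buf = io.StringIO()
    frame.to_csv(buf)
    return buf.getvalue().splitlines()[0]


header_before = csv_header(B_p)   # index name 0 ends up in the header
B_p.index.name = None             # drop the index name
header_after = csv_header(B_p)    # header now starts with an empty field

print(header_before)
print(header_after)
```

Only the index name ends up in the CSV header; the columns name (the "1") is shown when the frame is printed but is not written by to_csv for a frame like this one.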

EDIT: Added the index_label=False option today, which does what you want:

In [2]: df
Out[2]: 
       A  B
one    1  4
two    2  5
three  3  6

In [3]: df.to_csv('foo.csv', index_label=False)

In [4]: 
11:24 ~/code/pandas  (master)$ R

R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

re> read.csv('foo.csv')
      A B
one   1 4
two   2 5
three 3 6
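
The same behavior can be checked without leaving Python. The sketch below is an illustration rather than part of the original session: it writes to an in-memory buffer instead of foo.csv, and it uses pandas.read_csv as a stand-in for R's read.csv, on the assumption that both treat a header with one field fewer than the data rows as a sign that the first column holds the row names.

```python
import io

import pandas as pd

# Same frame as in the session above.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]},
                  index=['one', 'two', 'three'])

# index_label=False omits the header field for the index column,
# so the header line has one field fewer than the data lines.
buf = io.StringIO()
df.to_csv(buf, index_label=False)
csv_text = buf.getvalue()
print(csv_text)

# Reading it back: with a short header, the first column becomes the index.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```

Note that index_label=False differs from index=False: the row names are still written, only their header field is dropped.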
