pandas pivot dataframe to 3d data
Question
There seem to be a lot of possibilities to pivot flat table data into a 3d array but I'm somehow not finding one that works: Suppose I have some data with columns=['name', 'type', 'date', 'value']. When I try to pivot via
pivot(index='name', columns=['type', 'date'], values='value')
I get
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
Am I reading docs from dev pandas maybe? It seems like this is the usage described there. I am running 0.8 pandas.
I guess, I'm wondering if I have a MultiIndex ['x', 'y', 'z'] Series, is there a pandas way to put that in a panel? I can use groupby and get the job done, but then this is almost like what I would do in numpy to assemble an n-d array. Seems like a fairly generic operation so I would imagine it might be implemented already.
Recommended answer
pivot only supports using a single column to generate your columns. You probably want to use pivot_table to generate a pivot table using multiple columns, e.g.
pandas.pivot_table(your_dataframe, values='value', index='name', columns=['type', 'date'], aggfunc='sum')
The hierarchical columns mentioned in the API reference and documentation for pivot relate to cases where you have multiple value fields rather than multiple categories.
Assuming 'type' and 'date' are categories whose values should be used as the column names, you should use pivot_table.
However, if you want separate columns for the different value fields within the same category (e.g. 'type'), then you should use pivot with your category as the columns parameter and without specifying the values column.
For example, suppose you have this DataFrame:
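from pandas import DataFrame

df = DataFrame({'name': ['A', 'B', 'A', 'B'], 'type': [1, 1, 2, 2], 'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'], 'value': [1, 2, 3, 4]})
pt = df.pivot_table(values='value', index='name', columns=['type', 'date'])
# Keyword arguments are used here because recent pandas versions no longer accept positional index/columns
p = df.pivot(index='name', columns='type')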
pt will be:
type          1          2
date 2012-01-01 2012-02-01
name
A             1          3
B             2          4
p will be:
            date              value
type           1           2      1  2
name
A     2012-01-01  2012-02-01      1  3
B     2012-01-01  2012-02-01      2  4
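As a rough check of the difference between the two results, the sketch below rebuilds the same DataFrame and looks at the column hierarchy each call produces. It assumes a recent pandas release where `pivot` takes keyword arguments. The `pd` alias and the sub-frame names `type_1` and `values_only` are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'A', 'B'],
    'type': [1, 1, 2, 2],
    'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],
    'value': [1, 2, 3, 4],
})

# Two categories ('type', 'date') become two column levels; 'value' fills the cells.
pt = df.pivot_table(values='value', index='name', columns=['type', 'date'])

# One category ('type') becomes the inner column level; every remaining
# column ('date', 'value') becomes an entry in the outer level.
p = df.pivot(index='name', columns='type')

# Selecting on the outer column level returns a plain sub-frame.
type_1 = pt[1]               # columns: the dates observed for type 1
values_only = p['value']     # columns: the types, cells: the values

print(pt.columns.names)
print(p.columns.names)
print(type_1)
print(values_only)
```

Indexing with the outer level key, as in `pt[1]` or `p['value']`, is what makes the hierarchical result behave somewhat like a stack of 2-d tables.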
NOTE: For pandas versions < 0.14.0, the index and columns keyword arguments should be replaced with rows and cols respectively.
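The question also asks how to turn a Series with a MultiIndex ['x', 'y', 'z'] into 3-d data, which the pivot calls do not do directly. The following is only a sketch under the assumption that a dense NumPy array is an acceptable target (Panel was removed from later pandas releases). The sample data and the names `s`, `levels`, `full` and `arr` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical Series indexed by three levels named x, y and z.
index = pd.MultiIndex.from_product(
    [['a', 'b'], [0, 1, 2], ['p', 'q']], names=['x', 'y', 'z']
)
s = pd.Series(np.arange(12), index=index)

# Sorted unique labels along each axis of the future array.
levels = [s.index.get_level_values(n).unique().sort_values() for n in s.index.names]

# Reindex against the full Cartesian product so that any missing
# (x, y, z) combination shows up as NaN instead of shifting positions.
full = pd.MultiIndex.from_product(levels, names=s.index.names)

# Reshape the flat values into an array with one axis per index level.
arr = s.reindex(full).to_numpy().reshape([len(level) for level in levels])

print(arr.shape)
```

The labels in `levels` give the meaning of each position along each axis, so `arr[i, j, k]` corresponds to `(levels[0][i], levels[1][j], levels[2][k])`.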