Pandas: Convert DataFrame Column Values Into New DataFrame Indices and Columns
Question
I have a dataframe that looks like this:
a b c
0 1 10
1 2 10
2 2 20
3 3 30
4 1 40
4 3 10
The dataframe above has the default (0, 1, 2, 3, 4, ...) index. I would like to convert it into a dataframe that looks like this:
1 2 3
0 10 0 0
1 0 10 0
2 0 20 0
3 0 0 30
4 40 0 10
Here column 'a' of the first dataframe becomes the index of the second, the values of 'b' become the column names, and the values of 'c' are copied over, with 0 or NaN filling the missing values. The original dataset is large, so the second dataframe will be very sparse. I then intend to add this dataframe to a much larger one, which is straightforward.
Can anyone advise the best way to achieve this?
Recommended answer
You can use the pivot method for this.
See the docs: http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-pivoting-dataframe-objects
An example:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a':[0,1,2,3,4,4], 'b':[1,2,2,3,1,3], 'c':[10,10,20,30,40,10]})
In [3]: df
Out[3]:
a b c
0 0 1 10
1 1 2 10
2 2 2 20
3 3 3 30
4 4 1 40
5 4 3 10
In [4]: df.pivot(index='a', columns='b', values='c')
Out[4]:
b 1 2 3
a
0 10 NaN NaN
1 NaN 10 NaN
2 NaN 20 NaN
3 NaN NaN 30
4 40 NaN 10
If you want zeros instead of NaNs, as in your example, you can use fillna:
In [5]: df.pivot(index='a', columns='b', values='c').fillna(0)
Out[5]:
b 1 2 3
a
0 10 0 0
1 0 10 0
2 0 20 0
3 0 0 30
4 40 0 10
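One caveat for a large dataset: pivot raises a ValueError when the same ('a', 'b') pair occurs more than once. A minimal sketch of a more forgiving variant uses pivot_table, which aggregates duplicate pairs and can fill the gaps in the same call. The choice of aggfunc='sum' and the name wide are assumptions made for illustration, not part of the original answer; pick the aggregation that suits your data.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'a': [0, 1, 2, 3, 4, 4],
    'b': [1, 2, 2, 3, 1, 3],
    'c': [10, 10, 20, 30, 40, 10],
})

# pivot_table aggregates repeated (a, b) pairs instead of raising,
# and fill_value replaces the missing combinations with 0.
# aggfunc='sum' is an assumption; the sample data has no duplicates,
# so the values are the same as with pivot(...).fillna(0).
wide = df.pivot_table(index='a', columns='b', values='c',
                      aggfunc='sum', fill_value=0)
print(wide)
```

With the sample data this should give the same table as the pivot(...).fillna(0) call above, since no ('a', 'b') pair is repeated.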