Reshape a pandas dataframe


Question

Suppose a dataframe like this:

import pandas as pd
df = pd.DataFrame([[1,2,3,4],[5,6,7,8],[9,10,11,12]], columns = ['A', 'B', 'A1', 'B1'])

I would like to have a dataframe which looks like:

What does not work:

new_rows = int(df.shape[1]/2) * df.shape[0]
new_cols = 2
df.values.reshape(new_rows, new_cols, order='F')

Of course I could loop over the data and build a new list of lists, but there must be a better way. Any ideas?
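For reference, here is a small sketch of what the reshape attempt actually returns. With `order='F'`, NumPy reads the values column by column and fills the result column by column too, so each output row pairs a value from `A` with one from `A1` (or `B` with `B1`) instead of pairing `A` with `B`. The variable name `attempt` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                  columns=['A', 'B', 'A1', 'B1'])

# Column-major (Fortran) order walks down each column in turn:
# 1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12
# and the 6x2 result is filled in the same column-first order.
attempt = df.values.reshape(6, 2, order='F')
print(attempt)
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]
#  [ 2  4]
#  [ 6  8]
#  [10 12]]
```

The first column of the result is `A` followed by `B`, and the second is `A1` followed by `B1`, which is why a plain reshape cannot produce the wanted layout here.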

Recommended answer

The pd.wide_to_long function is built almost exactly for this situation, where several variables share the same prefix and differ only in a numeric suffix. The only difference here is that your first set of variables has no suffix, so you will need to rename your columns first.

The only issue with pd.wide_to_long is that, unlike melt, it must have an identification variable, i. reset_index is used to create this uniquely identifying column, which is dropped later. I think this might get corrected in the future.

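# Give the first group an explicit suffix so both groups share the stubs 'A' and 'B';
# reset_index adds the 'index' column that wide_to_long uses as its id variable.
df1 = df.rename(columns={'A':'A1', 'B':'B1', 'A1':'A2', 'B1':'B2'}).reset_index()
# Reshape to long form, then keep only the value columns and the group id.
pd.wide_to_long(df1, stubnames=['A', 'B'], i='index', j='id')\
  .reset_index()[['A', 'B', 'id']]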

    A   B id
0   1   2  1
1   5   6  1
2   9  10  1
3   3   4  2
4   7   8  2
5  11  12  2
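
As a cross-check, the same layout can be sketched without pd.wide_to_long by stacking the two column groups with pd.concat. This is not the method from the answer above, only an alternative under the assumption that there are exactly two groups whose names are known in advance; the names `first`, `second`, and `stacked` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                  columns=['A', 'B', 'A1', 'B1'])

# First group: already named A and B; tag it with id 1.
first = df[['A', 'B']].assign(id=1)

# Second group: rename A1/B1 to A/B so the columns line up, tag it with id 2.
second = df[['A1', 'B1']].rename(columns={'A1': 'A', 'B1': 'B'}).assign(id=2)

# Stack the groups vertically and renumber the rows 0..5.
stacked = pd.concat([first, second], ignore_index=True)
print(stacked)
```

With many suffixed groups, hard-coding each slice like this gets tedious, which is where pd.wide_to_long is the more convenient choice.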
