Can pandas split/merge columns based on patterns in their name?


Problem description

Can pandas split and/or merge columns, based on patterns in the column name? Here's a DataFrame:

    meas1_left  meas1_right  meas2_left  meas2_right
0            1            2           3            4
1            6            7           8            9

I'd like to turn the above data into this (I don't really care how the new frame is indexed):

    meas1  meas2  side
0       1      3  left
1       2      4  right
2       6      8  left
3       7      9  right

Solution

You can first create a MultiIndex from the columns by splitting each name on the underscore with str.split:

df.columns = df.columns.str.split('_', expand=True)
print(df)
  meas1       meas2      
   left right  left right
0     1     2     3     4
1     6     7     8     9
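
A self-contained sketch of this first step, for readers who want to run it. The frame is rebuilt from the question's data, and the level names "meas" and "side" are assumptions added here for illustration; the original answer leaves the levels unnamed.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "meas1_left": [1, 6],
    "meas1_right": [2, 7],
    "meas2_left": [3, 8],
    "meas2_right": [4, 9],
})

# Split each column name on "_" and expand the parts into a two-level MultiIndex.
df.columns = df.columns.str.split("_", expand=True)

# Assumption: name the levels so they can be referred to by name later.
df.columns.names = ["meas", "side"]

print(df.columns.nlevels)                    # number of column levels
print(df["meas1"])                           # both sides of one measurement
print(df.xs("left", axis=1, level="side"))   # one side of every measurement
```

With the inner level named "side", a later stack() followed by reset_index() should produce a column called "side" directly rather than "index".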

Then stack it, which moves the inner column level (left/right) into the row index:

print(df.stack().reset_index(level=0, drop=True).reset_index())
   index  meas1  meas2
0   left      1      3
1  right      2      4
2   left      6      8
3  right      7      9

If you also need to rename the new 'index' column and change the order of the columns:

print(df.stack()
        .reset_index(level=0, drop=True)
        .reset_index()
        .rename(columns={'index': 'side'})[['meas1', 'meas2', 'side']])

   meas1  meas2   side
0      1      3   left
1      2      4  right
2      6      8   left
3      7      9  right
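
The whole sequence can be wrapped in a small helper. This is only a sketch: the function name split_columns_to_long and its parameters sep and new_col are hypothetical, and it assumes every column name contains the separator exactly once. Recent pandas versions may emit a FutureWarning from stack().

```python
import pandas as pd

def split_columns_to_long(df, sep="_", new_col="side"):
    """Hypothetical helper: turn 'name<sep>suffix' columns into long form."""
    out = df.copy()  # leave the caller's frame untouched
    # 'meas1_left' -> ('meas1', 'left'), giving two column levels.
    out.columns = out.columns.str.split(sep, expand=True)
    long = (out.stack()                         # inner column level -> row index
               .reset_index(level=0, drop=True)  # drop the original row labels
               .rename_axis(new_col)             # name the remaining index level
               .reset_index())                   # and turn it into a column
    # Put the suffix column last, as in the desired output.
    value_cols = [c for c in long.columns if c != new_col]
    return long[value_cols + [new_col]]

wide = pd.DataFrame({
    "meas1_left": [1, 6],
    "meas1_right": [2, 7],
    "meas2_left": [3, 8],
    "meas2_right": [4, 9],
})
result = split_columns_to_long(wide)
print(result)
```

Using rename_axis before reset_index avoids the separate rename(columns={'index': 'side'}) step; either form should give the same frame.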

EDIT: The str methods on an Index are implemented from pandas 0.16.1. If you use an older version, try:

a = df.columns.to_series().str.split('_').apply(pd.Series)
tuples = list(zip(a.iloc[:, 0], a.iloc[:, 1]))
print(tuples)
[('meas1', 'left'), ('meas1', 'right'), ('meas2', 'left'), ('meas2', 'right')]

df.columns = pd.MultiIndex.from_tuples(tuples)
print(df)
  meas1       meas2      
   left right  left right
0     1     2     3     4
1     6     7     8     9
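
As a variation on the fallback above, the tuples can also be built with plain Python string splitting, which does not rely on the str accessor at all. This is a sketch that assumes each column name contains exactly one underscore; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "meas1_left": [1, 6],
    "meas1_right": [2, 7],
    "meas2_left": [3, 8],
    "meas2_right": [4, 9],
})

# Split each name once on "_" using ordinary str.split, no pandas accessor needed.
tuples = [tuple(name.split("_", 1)) for name in df.columns]
print(tuples)

# Build the two-level column index from the (measurement, side) pairs.
df.columns = pd.MultiIndex.from_tuples(tuples)
print(df)
```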
