How to do Multi-Column from_tuples?

Question

I understand how to use pd.MultiIndex.from_tuples() to change something like

       Value
(A,a)  1
(B,a)  2
(B,b)  3

into

                Value
Caps Lower      
A    a          1
B    a          2
B    b          3

But how do I change column tuples of the form

       (A, a)  (A, b) (B,a)  (B,b)
index
1      1       2      2      3
2      2       3      3      2
3      3       4      4      1

into the form

 Caps         A              B
 Lower        a       b      a      b
 index
 1            1       2      2      3
 2            2       3      3      2
 3            3       4      4      1

Many thanks.

The reason I have tuple column headers is that when I joined a DataFrame with single-level columns onto a DataFrame with multi-level columns, the multi-level columns were turned into tuples of strings and the single-level columns were left as plain strings.

Edit 2 - Alternate solution: As stated, the problem arose from a join between DataFrames whose columns have different numbers of levels, which reduced the multi-level columns to tuples of strings. To get around this, prior to the join I set df.columns = [('col_level_0','col_level_1','col_level_2')] on the DataFrame I wished to join.
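The workaround in Edit 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's exact code: the frames, the column name "extra", and the padded label ("C", "extra") are all hypothetical, and the header is padded with pd.MultiIndex.from_tuples rather than a plain list of tuples so that both frames have the same number of column levels before the join.

```python
import pandas as pd

# Frame with two-level columns (Caps, Lower), shaped like the question's data.
multi = pd.DataFrame(
    [[1, 2], [2, 3], [3, 4]],
    columns=pd.MultiIndex.from_tuples(
        [("A", "a"), ("A", "b")], names=["Caps", "Lower"]
    ),
)

# Frame with a single-level column; "extra" is a hypothetical name.
single = pd.DataFrame({"extra": [10, 20, 30]})

# Pad the single-level header to two levels before joining.
# ("C", "extra") is an illustrative label, not taken from the original post.
single.columns = pd.MultiIndex.from_tuples(
    [("C", "extra")], names=["Caps", "Lower"]
)

# Both frames now have two column levels, so the join keeps a MultiIndex.
joined = multi.join(single)
print(joined)
```

Because the level counts match, the joined frame should keep hierarchical columns instead of falling back to flat tuple labels.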

Recommended answer

Assign the result of pd.MultiIndex.from_tuples directly to columns, passing in your existing columns:

In [186]:
import numpy as np
import pandas as pd

l = [('A', 'a'), ('A', 'b'), ('B', 'a'), ('B', 'b')]
df = pd.DataFrame(np.random.randn(5, 4), columns=l)
df

Out[186]:
     (A, a)    (A, b)    (B, a)    (B, b)
0 -0.876353  0.553742  1.631858 -0.561309
1  0.463058 -0.455014 -0.491336 -1.436059
2  0.337810  0.233624 -0.571749 -2.259763
3  1.073057 -0.475894  0.999643 -0.379743
4  0.441800  0.311202 -0.191552  0.291268

In [187]:    
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
df

Out[187]:
Caps          A                   B          
Lower         a         b         a         b
0     -0.876353  0.553742  1.631858 -0.561309
1      0.463058 -0.455014 -0.491336 -1.436059
2      0.337810  0.233624 -0.571749 -2.259763
3      1.073057 -0.475894  0.999643 -0.379743
4      0.441800  0.311202 -0.191552  0.291268
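For reference, here is the same conversion applied to the exact table from the question, as a small deterministic sketch. Building the flat header with pd.Index(..., tupleize_cols=False) is an assumption made here to guarantee the starting columns are plain tuples; the variable names are illustrative.

```python
import pandas as pd

# Flat (single-level) header whose labels are tuples, as in the question.
cols = pd.Index(
    [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")], tupleize_cols=False
)
df = pd.DataFrame(
    [[1, 2, 2, 3], [2, 3, 3, 2], [3, 4, 4, 1]],
    index=pd.Index([1, 2, 3], name="index"),
    columns=cols,
)
was_flat = not isinstance(df.columns, pd.MultiIndex)

# Convert the tuple labels into a two-level column index.
df.columns = pd.MultiIndex.from_tuples(df.columns, names=["Caps", "Lower"])

# With a real MultiIndex, selecting a top-level key returns its sub-columns.
sub = df["A"]
print(sub)
```

Selecting df["A"] only works this way once the columns are hierarchical, which makes it a quick check that the conversion took effect.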

Note that you can also assign directly to the names attribute of the columns attribute, like so:

df.columns.names = ['Caps','Lower']

This is not to be confused with the name attribute.
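The distinction can be illustrated with a short sketch: name labels a single-level index, while names holds one label per level. The index contents below are illustrative only.

```python
import pandas as pd

# A single-level index carries one label in `name`;
# `names` exposes the same label as a one-element list-like.
flat = pd.Index(["a", "b"], name="Lower")
print(flat.name)
print(list(flat.names))

# A MultiIndex has one label per level, set through `names`.
multi = pd.MultiIndex.from_tuples([("A", "a"), ("A", "b")])
multi.names = ["Caps", "Lower"]
print(list(multi.names))

# Level labels can then be used to look up a level's values.
caps = multi.get_level_values("Caps")
print(list(caps))
```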
