Delete part of a row in pandas / shift up part of a row? Align column headings


Question

I have a data frame where the headings I want do not currently line up:

    In [1]: df = pd.read_excel('example.xlsx')
            print (df.head(10))

    Out [1]:                                 Portfolio  Asset        Country   Quantity  
         Unique Identifier Number of fund       B24     B65             B35      B44   
          456               2                General  Type A  UNITED KINGDOM        1   
          123               3                General  Type B              US        2   
          789               2                General  Type C  UNITED KINGDOM        4   
          4852              4                General  Type C  UNITED KINGDOM        4   
          654               1                General  Type A          FRANCE        3   
          987               5                General  Type B  UNITED KINGDOM        2   
          321               1                General  Type B         GERMANY        1   
          951               3                General  Type A  UNITED KINGDOM        2   
          357               4                General  Type C  UNITED KINGDOM        3   

As we can see, there are 2 blank cells above the first 2 column headings, and below the next 4 column headings are "B" numbers which I don't care about.

So, 2 questions. How can I shift up the headings of the first 2 columns when there is no column heading to identify them with (due to the blank cells above)?

And how can I delete just row 2 of the remaining columns and have the data below move up to take the place of the "B" numbers?

I found some similar questions already asked, such as "python: shift column in pandas dataframe up by one", but I don't think any of them covers the particular intricacies above.

Also, I'm quite new to Python and pandas, so if this is really basic I apologise!

Recommended answer

If I understand correctly, you can use:

#create df from multiindex in columns
df1 = pd.DataFrame([x for x in df.columns.values])
print(df1)
           0                  1
0             Unique Identifier
1                Number of fund
2  Portfolio                B24
3      Asset                B65
4    Country                B35
5   Quantity                B44

#where the string in column 1 is shorter than 4 characters (the "B" codes), copy the name from column 0 into column 1
df1.loc[df1.iloc[:,1].str.len() < 4, 1] = df1.iloc[:,0]
print(df1)
           0                  1
0             Unique Identifier
1                Number of fund
2  Portfolio          Portfolio
3      Asset              Asset
4    Country            Country
5   Quantity           Quantity

#set columns from column 1 of df1 (the cleaned-up names)
df.columns = df1.iloc[:,1]

print(df)
0  Unique Identifier  Number of fund Portfolio   Asset         Country  \
0                456               2   General  Type A  UNITED KINGDOM   
1                123               3   General  Type B              US   
2                789               2   General  Type C  UNITED KINGDOM   
3               4852               4   General  Type C  UNITED KINGDOM   
4                654               1   General  Type A          FRANCE   
5                987               5   General  Type B  UNITED KINGDOM   
6                321               1   General  Type B         GERMANY   
7                951               3   General  Type A  UNITED KINGDOM   
8                357               4   General  Type C  UNITED KINGDOM   

0  Quantity  
0         1  
1         2  
2         4  
3         4  
4         3  
5         2  
6         1  
7         2  
8         3  
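
For readers without the original spreadsheet, the idea above can be tried on a small hand-built frame. This is only a sketch under stated assumptions: the two-level column header is constructed with `pd.MultiIndex.from_tuples` rather than read from `example.xlsx`, the blank top-level cells are assumed to be empty strings, and the names `columns` and `flat_names` are illustrative. It also swaps the string-length test for a check on whether the top level is blank, which is a variant of the approach above and not the answer's exact code.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: two header rows read as a
# column MultiIndex, where the first two columns have a blank top level.
columns = pd.MultiIndex.from_tuples([
    ("", "Unique Identifier"),
    ("", "Number of fund"),
    ("Portfolio", "B24"),
    ("Asset", "B65"),
    ("Country", "B35"),
    ("Quantity", "B44"),
])
df = pd.DataFrame(
    [
        [456, 2, "General", "Type A", "UNITED KINGDOM", 1],
        [123, 3, "General", "Type B", "US", 2],
    ],
    columns=columns,
)

# Keep the top-level name when it is non-blank, otherwise fall back to
# the second level. This drops the "B" codes and keeps the two
# identifier headings.
flat_names = [top if top.strip() else bottom for top, bottom in df.columns]
df.columns = flat_names

print(df.columns.tolist())
```

Testing for a blank top level avoids depending on the "B" codes being shorter than 4 characters, but it does assume the blanks arrive as empty strings; if pandas labels them differently in a given file, the condition would need adjusting.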

Edit, following the comments:

print(df.columns)
Index(['Portfolio', 'Asset', 'Country', 'Quantity'], dtype='object')

#overwrite the first row (the "B" codes) with the column names
df.iloc[0,:] = df.columns

#move the index into ordinary columns
df = df.reset_index()
#set columns from the first row, which now holds all six headings
df.columns = df.iloc[0,:]
df.columns.name = None
#remove the first row
print(df.iloc[1:,:])
  Unique Identifier Number of fund Portfolio   Asset         Country Quantity
1               456              2   General  Type A  UNITED KINGDOM        1
2               123              3   General  Type B              US        2
3               789              2   General  Type C  UNITED KINGDOM        4
4              4852              4   General  Type C  UNITED KINGDOM        4
5               654              1   General  Type A          FRANCE        3
6               987              5   General  Type B  UNITED KINGDOM        2
7               321              1   General  Type B         GERMANY        1
8               951              3   General  Type A  UNITED KINGDOM        2
9               357              4   General  Type C  UNITED KINGDOM        3
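
A different route to the same result is to treat both header rows as plain data and build the headings by hand. The sketch below is an assumption-laden illustration, not the answer's method: it assumes the sheet could be read with no header row (for example via the `header=None` option of `pd.read_excel`) so that blank cells come through as missing values, and it builds that raw frame by hand instead of reading `example.xlsx`. The names `raw`, `top`, `bottom`, `names` and `clean` are hypothetical.

```python
import pandas as pd

# Hypothetical raw frame, as if the sheet had been read without a header
# row: row 0 and row 1 are the two misaligned header rows.
raw = pd.DataFrame([
    [None, None, "Portfolio", "Asset", "Country", "Quantity"],
    ["Unique Identifier", "Number of fund", "B24", "B65", "B35", "B44"],
    [456, 2, "General", "Type A", "UNITED KINGDOM", 1],
    [123, 3, "General", "Type B", "US", 2],
])

top, bottom = raw.iloc[0], raw.iloc[1]

# Use the top header cell where it is present, else the cell below it.
# This "shifts up" the first two headings and discards the "B" codes.
names = top.where(top.notna(), bottom)

# Drop both header rows and renumber the remaining data rows from 0.
clean = raw.iloc[2:].reset_index(drop=True)
clean.columns = names.tolist()

print(clean)
```

Because the header rows were mixed in with the data, every column in `clean` has object dtype; converting the numeric columns afterwards (for instance with `pd.to_numeric`) would be a separate step.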
