Delete part of a row in pandas / shift part of a row up? Aligning column headings
Question
I have a data frame where the headings I want do not currently line up:
In [1]: import pandas as pd
        df = pd.read_excel('example.xlsx')
        print(df.head(10))
Out [1]:
                                   Portfolio  Asset   Country         Quantity
Unique Identifier  Number of fund  B24        B65     B35             B44
456                2               General    Type A  UNITED KINGDOM  1
123                3               General    Type B  US              2
789                2               General    Type C  UNITED KINGDOM  4
4852               4               General    Type C  UNITED KINGDOM  4
654                1               General    Type A  FRANCE          3
987                5               General    Type B  UNITED KINGDOM  2
321                1               General    Type B  GERMANY         1
951                3               General    Type A  UNITED KINGDOM  2
357                4               General    Type C  UNITED KINGDOM  3
As we can see, there are 2 blank cells above the first 2 column headings, and below the next 4 column headings there are "B" numbers that I don't care about.
So, 2 questions: how can I shift the first 2 columns up when they have no column heading to identify them by (because of the blank cells above)?
And how can I delete just row 2 of the remaining columns and have the data below move up to take the place of the "B" numbers?
I found some similar questions already asked, such as "python: shift column in pandas dataframe up by one", but I don't think any of them handles the particular intricacies above.
Also, I'm quite new to Python and pandas, so I apologise if this is really basic!
Recommended answer
IIUC (if I understand correctly), you can use:
# create a DataFrame from the MultiIndex in the columns
df1 = pd.DataFrame([x for x in df.columns.values])
print(df1)
   0          1
0             Unique Identifier
1             Number of fund
2  Portfolio  B24
3  Asset      B65
4  Country    B35
5  Quantity   B44
# where the string in column 1 is shorter than 4 characters (the "B" codes),
# copy the value from column 0 into column 1
df1.loc[df1.iloc[:, 1].str.len() < 4, 1] = df1.iloc[:, 0]
print(df1)
   0          1
0             Unique Identifier
1             Number of fund
2  Portfolio  Portfolio
3  Asset      Asset
4  Country    Country
5  Quantity   Quantity
# set the columns of df from column 1 of df1
df.columns = df1.iloc[:, 1]
print(df)
0  Unique Identifier  Number of fund  Portfolio  Asset   Country         \
0  456                2               General    Type A  UNITED KINGDOM
1  123                3               General    Type B  US
2  789                2               General    Type C  UNITED KINGDOM
3  4852               4               General    Type C  UNITED KINGDOM
4  654                1               General    Type A  FRANCE
5  987                5               General    Type B  UNITED KINGDOM
6  321                1               General    Type B  GERMANY
7  951                3               General    Type A  UNITED KINGDOM
8  357                4               General    Type C  UNITED KINGDOM

0  Quantity
0  1
1  2
2  4
3  4
4  3
5  2
6  1
7  2
8  3
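
For readers without the Excel file, the same idea can be tried on a small hand-built frame. This is only a sketch under assumptions: the two-level column header is constructed by hand as a stand-in for what `read_excel` produced, the blank top-level cells are assumed to be empty strings, and a plain list comprehension replaces the `str.len() < 4` test used above.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet: a two-level column header in
# which the first two columns have an empty top level (an assumption).
columns = pd.MultiIndex.from_tuples([
    ("", "Unique Identifier"),
    ("", "Number of fund"),
    ("Portfolio", "B24"),
    ("Asset", "B65"),
    ("Country", "B35"),
    ("Quantity", "B44"),
])
df = pd.DataFrame(
    [
        [456, 2, "General", "Type A", "UNITED KINGDOM", 1],
        [123, 3, "General", "Type B", "US", 2],
    ],
    columns=columns,
)

# Keep the top-level name when it is non-empty; otherwise fall back to the
# second-level name. This drops the "B" codes and fills the blank headings.
df.columns = [top if top else bottom for top, bottom in df.columns]
print(df.columns.tolist())
```

If the blank cells come through as something other than empty strings (for example placeholder names), the condition in the comprehension would need to be adapted accordingly.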
Edit, following a comment:
print(df.columns)
Index(['Portfolio', 'Asset', 'Country', 'Quantity'], dtype='object')
# overwrite the first row (the "B" codes) with the column names
df.iloc[0, :] = df.columns
# move the index into regular columns
df = df.reset_index()
# set the columns from the first row
df.columns = df.iloc[0, :]
df.columns.name = None
# remove the first row
print(df.iloc[1:, :])
   Unique Identifier  Number of fund  Portfolio  Asset   Country         Quantity
1  456                2               General    Type A  UNITED KINGDOM  1
2  123                3               General    Type B  US              2
3  789                2               General    Type C  UNITED KINGDOM  4
4  4852               4               General    Type C  UNITED KINGDOM  4
5  654                1               General    Type A  FRANCE          3
6  987                5               General    Type B  UNITED KINGDOM  2
7  321                1               General    Type B  GERMANY         1
8  951                3               General    Type A  UNITED KINGDOM  2
9  357                4               General    Type C  UNITED KINGDOM  3
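
A possible variation on this second case, sketched on a hand-built frame: instead of overwriting the first row, take the headings straight from the index labels of that row. The layout here is an assumption (the two leading columns sit in a two-level index whose first entry holds their headings), and `level_names` and `result` are names introduced only for illustration.

```python
import pandas as pd

# Assumed layout: the two leading columns ended up in the index, and the
# first row carries their headings plus the unwanted "B" codes.
index = pd.MultiIndex.from_tuples(
    [("Unique Identifier", "Number of fund"), (456, 2), (123, 3)]
)
df = pd.DataFrame(
    [
        ["B24", "B65", "B35", "B44"],
        ["General", "Type A", "UNITED KINGDOM", 1],
        ["General", "Type B", "US", 2],
    ],
    index=index,
    columns=["Portfolio", "Asset", "Country", "Quantity"],
)

# The index labels of the first row are the headings for the index levels.
level_names = list(df.index[0])

# Drop the "B" row, name the index levels, then turn them into columns.
result = df.iloc[1:].copy()
result.index = result.index.set_names(level_names)
result = result.reset_index()
print(result.columns.tolist())
```

This leaves the data rows untouched and yields a fresh 0-based index, at the cost of assuming the headings really do sit in the first index entry.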