Python Pandas:如何将一行移到数据框的第一行? [英] Python Pandas: How to move one row to the first row of a Dataframe?

查看:210
本文介绍了Python Pandas:如何将一行移到数据框的第一行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给出已建立索引的现有数据框.

Given an existing Dataframe that is indexed.

>>> df = pd.DataFrame(np.random.randn(10, 5),columns=['a', 'b', 'c', 'd', 'e'])
>>> df
          a         b         c         d         e
0 -0.131666 -0.315019  0.306728 -0.642224 -0.294562
1  0.769310 -1.277065  0.735549 -0.900214 -1.826320
2 -1.561325 -0.155571  0.544697  0.275880 -0.451564
3  0.612561 -0.540457  2.390871 -2.699741  0.534807
4 -1.504476 -2.113726  0.785208 -1.037256 -0.292959
5  0.467429  1.327839 -1.666649  1.144189  0.322896
6 -0.306556  1.668364  0.036508  0.596452  0.066755
7 -1.689779  1.469891 -0.068087 -1.113231  0.382235
8  0.028250 -2.145618  0.555973 -0.473131 -0.638056
9  0.633408 -0.791857  0.933033  1.485575 -0.021429
>>> df.set_index("a")
                  b         c         d         e
a                                                
-0.131666 -0.315019  0.306728 -0.642224 -0.294562
 0.769310 -1.277065  0.735549 -0.900214 -1.826320
-1.561325 -0.155571  0.544697  0.275880 -0.451564
 0.612561 -0.540457  2.390871 -2.699741  0.534807
-1.504476 -2.113726  0.785208 -1.037256 -0.292959
 0.467429  1.327839 -1.666649  1.144189  0.322896
-0.306556  1.668364  0.036508  0.596452  0.066755
-1.689779  1.469891 -0.068087 -1.113231  0.382235
 0.028250 -2.145618  0.555973 -0.473131 -0.638056
 0.633408 -0.791857  0.933033  1.485575 -0.021429

如何将第三行移至第一行?

How to move the 3rd row to the first row?

也就是说,预期结果:

                  b         c         d         e
a                                                
-1.561325 -0.155571  0.544697  0.275880 -0.451564
-0.131666 -0.315019  0.306728 -0.642224 -0.294562
 0.769310 -1.277065  0.735549 -0.900214 -1.826320
 0.612561 -0.540457  2.390871 -2.699741  0.534807
-1.504476 -2.113726  0.785208 -1.037256 -0.292959
 0.467429  1.327839 -1.666649  1.144189  0.322896
-0.306556  1.668364  0.036508  0.596452  0.066755
-1.689779  1.469891 -0.068087 -1.113231  0.382235
 0.028250 -2.145618  0.555973 -0.473131 -0.638056
 0.633408 -0.791857  0.933033  1.485575 -0.021429

现在,原来的第一行应成为第二行.

Now the original first row should become the second row.

推荐答案

重新索引可能是将行以任何新的顺序放在一个明显的步骤中的最佳解决方案,除非它可能需要生成一个可能过大的新DataFrame.

Reindexing is probably the optimal solution for putting the rows in any new order in 1 apparent step, except it may require producing a new DataFrame which could be prohibitively large.

例如

import pandas as pd

t = pd.read_csv('table.txt',sep='\s+')
t
Out[81]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

t.index
Out[82]: Int64Index([0, 1, 2, 3], dtype='int64')

t2 = t.reindex([2,0,1,3]) # cannot do this in place
t2
Out[93]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

现在可以将索引重新设置为range(4),而无需重新编制索引:

Now the index can be set back to range(4) without reindexing:

t2.index=range(4)
Out[102]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

这也可以通过元组切换"和行选择作为基本机制来完成,而无需创建新的DataFrame.例如:

It can also be done with 'tuple switching' and row selection as a basic mechanism and without creating a new DataFrame. For example:

import pandas as pd

t = pd.read_csv('table.txt',sep='\s+')

t.ix[1], t.ix[2] = t.ix[2], t.ix[1]
t.ix[0], t.ix[1] = t.ix[1], t.ix[0]  
t
Out[96]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

另一种就地方法将DataFrame索引设置为所需的顺序,以便例如第3行获得索引0,依此类推,然后对DataFrame进行就地排序.它封装在以下函数中,该函数假定对行进行了一定范围(m)的正整数m索引,并且对DataFrame进行了简单索引(无MultiIndex),如问题中提供的示例一样.

Another in place method sets the DataFrame index for the desired ordering so that, for example, the 3rd row gets index 0, etc. and then the DataFrame is sorted in place. It's encapsulated in the following function that assumes the rows are indexed with some range(m) for positive integer m and the DataFrame is simply indexed (no MultiIndex) as in the example provided in the question.

def putfirst(n,df):
    if not isinstance(n, int):
        print 'error: 1st arg must be an int'
        return
    if n < 1:
        print 'error: 1st arg must be an int > 0'
        return
    if n == 1:
       print 'nothing to do when first arg == 1'
       return
    if n > len(df):
       print 'error: n exceeds the number of rows in the DataFrame'
       return
    df.index = range(1,n) + [0] + range(n,df.index[-1]+1)
    df.sort(inplace=True)

putfirst的参数为n,这是要重定位到第一行位置的行的序数位置,因此,如果要重定位第三行,则n = 3;否则,n = 3. df是包含要重定位的行的DataFrame.

The arguments of putfirst are n, which is the ordinal position of the row to relocate to the first row position, so that if the 3rd row is to be so relocated then n = 3; and df is the DataFrame containing the row to be relocated.

这是一个演示:

import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5),columns=['a', 'b', 'c', 'd', 'e'])

df.set_index("a") # ineffective without assignment or inplace=True
Out[182]: 
                  b         c         d         e
a                                                
 1.394072 -1.076742 -0.192466 -0.871188  0.420852
-1.211411 -0.258867 -0.581647 -1.260421  0.464575
-1.070241  0.804223 -0.156736  2.010390 -0.887104
-0.977936 -0.267217  0.483338 -0.400333  0.449880
 0.399594 -0.151575 -2.557934  0.160807  0.076525
-0.297204 -1.294274 -0.885180 -0.187497 -0.493560
-0.115413 -0.350745  0.044697 -0.897756  0.890874
-1.151185 -2.612303  1.141250 -0.867136  0.383583
-0.437030  0.347489 -1.230179  0.571078  0.060061
-0.225524  1.349726  1.350300 -0.386653  0.865990

df
Out[183]: 
          a         b         c         d         e
0  1.394072 -1.076742 -0.192466 -0.871188  0.420852
1 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
2 -1.070241  0.804223 -0.156736  2.010390 -0.887104
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990

df.index
Out[184]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

putfirst(3,df)
df
Out[186]: 
          a         b         c         d         e
0 -1.070241  0.804223 -0.156736  2.010390 -0.887104
1  1.394072 -1.076742 -0.192466 -0.871188  0.420852
2 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990

这篇关于Python Pandas:如何将一行移到数据框的第一行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆