Inserting new rows in pandas data frame at specific indices


Question

I have the following data frame df with three columns, "identifier", "values" and "subid":

     identifier   values    subid
0      1          101       1
1      1          102       1
2      1          103       2 #index in list x        
3      1          104       2
4      1          105       2
5      2          106       3   
6      2          107       3
7      2          108       3
8      2          109       4 #index in list x
9      2          110       4
10     3          111       5
11     3          112       5 
12     3          113       6 #index in list x

I have a list of indices, say

x = [2, 8, 12] 

I want to insert a row just before each index listed in x. For example, the row inserted just before index 2 should have the same identifier as the row at index 2 (i.e. 1) and the same values as the row at index 2 (i.e. 103), but its subid should be (subid at index 2) - 1, or simply the subid of the previous row, i.e. 1.

Below is the final resultant df I expect:

   identifier   values    subid
0      1          101       1
1      1          102       1
2      1          103       1 #new row inserted     
3      1          103       2 #index in list x        
4      1          104       2
5      1          105       2
6      2          106       3   
7      2          107       3
8      2          108       3
9      2          109       3 #new row inserted
10     2          109       4 #index in list x
11     2          110       4
12     3          111       5
13     3          112       5 
14     3          113       5 #new row inserted
15     3          113       6 #index in list x

The code I have been trying:

 m = df.index       #storing the indices of the df
 #m

 for i in m:
     if i in x:     #x is the given list of indices
         df.iloc[i-1]["identifier"] = df.iloc[i]["identifier"]
         df.iloc[i-1]["values"] = df.iloc[i]["values"]
         df.iloc[i-1]["subid"] = (df.iloc[i]["subid"]-1)
 df

The above code simply replaces the rows at the (i-1) indices instead of inserting additional rows with the above values. Please help.

Please let me know if anything is unclear.

Recommended answer

Preserving the index order is the tricky part. I'm not sure this is the most efficient way to do this, but it should work.

import pandas as pd

x = [2, 8, 12]
rows = []
cur = {}

for i in df.index:
    if i in x:
        cur['index'] = i
        cur['identifier'] = df.iloc[i].identifier
        cur['values'] = df.iloc[i]['values']
        cur['subid'] = df.iloc[i].subid - 1
        rows.append(cur)
        cur = {}

Then, iterate through the new rows list, and perform an incremental concat, inserting each new row into the correct spot.

# offset tracks the number of rows already inserted, so that each new row lands in the correct position
offset = 0

for d in rows:
    df = pd.concat([df.head(d['index'] + offset), pd.DataFrame([d]), df.tail(len(df) - (d['index']+offset))])
    offset+=1


# Drop the helper 'index' column carried by the inserted rows, then renumber the rows from 0
df = df.drop(columns='index').reset_index(drop=True)
df

    identifier  values  subid
0            1     101      1
1            1     102      1
2            1     103      1
3            1     103      2
4            1     104      2
5            1     105      2
6            2     106      3
7            2     107      3
8            2     108      3
9            2     109      3
10           2     109      4
11           2     110      4
12           3     111      5
13           3     112      5
14           3     113      5
15           3     113      6
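
As an alternative to building the frame up one concat at a time, the same result can probably be obtained in a single pass. The sketch below is not part of the original answer. It assumes df has the default 0..n-1 integer index, copies the rows listed in x, decrements their subid, places the copies ahead of the original frame, and relies on a stable sort of the index so that each copy ends up just before the row it was copied from. The names new_rows and result are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the question (default 0..12 index assumed).
df = pd.DataFrame({
    "identifier": [1] * 5 + [2] * 5 + [3] * 3,
    "values": list(range(101, 114)),
    "subid": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6],
})
x = [2, 8, 12]

# Copy the rows at the target indices and lower their subid by one.
new_rows = df.loc[x].copy()
new_rows["subid"] = new_rows["subid"] - 1

# Put the copies first, then sort by index with a stable algorithm:
# rows sharing an index label keep their relative order, so each copy
# stays ahead of its original row.
result = (
    pd.concat([new_rows, df])
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)

print(result)
```

Because the sort is stable, no offset bookkeeping is needed, and the frame is concatenated once rather than once per inserted row.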
