How to explode a list inside a Dataframe cell into separate rows


Problem description





I'm looking to turn a pandas cell containing a list into rows for each of those values.

So, take this:

If I'd like to unpack and stack the values in the 'nearest_neighbors' column so that each value becomes its own row within each 'opponent' index, how would I best go about this? Are there pandas methods that are meant for operations like this? I'm just not aware.

Thanks in advance, guys.

Solution

In the code below, I first reset the index to make the row iteration easier.

I create a list of lists where each element of the outer list is a row of the target DataFrame and each element of the inner list is one of the columns. This nested list is ultimately passed to the DataFrame constructor to create the desired DataFrame.

I use a lambda function together with a list comprehension to create a row for each element of nearest_neighbors, paired with the relevant name and opponent.

Finally, I create a new DataFrame from this list (using the original column names and setting the index back to name and opponent).

import pandas as pd

# Sample data: every (name, opponent) pair holds the same list of neighbors.
df = (pd.DataFrame({'name': ['A.J. Price'] * 3,
                    'opponent': ['76ers', 'blazers', 'bobcats'],
                    'nearest_neighbors': [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']] * 3})
      .set_index(['name', 'opponent']))

>>> df
                                                    nearest_neighbors
name       opponent                                                  
A.J. Price 76ers     [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]
           blazers   [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]
           bobcats   [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]

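# Turn the MultiIndex levels back into ordinary columns so each row exposes name and opponent.
df.reset_index(inplace=True)
rows = []
# apply() is used only for its side effect: one [name, opponent, neighbor] row is appended per list element.
_ = df.apply(lambda row: [rows.append([row['name'], row['opponent'], nn])
                         for nn in row.nearest_neighbors], axis=1)
df_new = pd.DataFrame(rows, columns=df.columns).set_index(['name', 'opponent'])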

>>> df_new
                    nearest_neighbors
name       opponent                  
A.J. Price 76ers          Zach LaVine
           76ers           Jeremy Lin
           76ers        Nate Robinson
           76ers                Isaia
           blazers        Zach LaVine
           blazers         Jeremy Lin
           blazers      Nate Robinson
           blazers              Isaia
           bobcats        Zach LaVine
           bobcats         Jeremy Lin
           bobcats      Nate Robinson
           bobcats              Isaia
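
As for whether pandas has a method meant for this: newer releases do. The sketch below is not part of the original answer; it assumes pandas 0.25 or later, where DataFrame.explode is available, and rebuilds the same sample frame so it runs on its own. The name df_exploded is just an illustrative choice.

```python
import pandas as pd

# Same sample data as in the answer, indexed by (name, opponent).
df = pd.DataFrame({
    'name': ['A.J. Price'] * 3,
    'opponent': ['76ers', 'blazers', 'bobcats'],
    'nearest_neighbors': [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']] * 3,
}).set_index(['name', 'opponent'])

# Assumption: pandas >= 0.25. explode() emits one row per list element
# and repeats the index values of the row the list came from.
df_exploded = df.explode('nearest_neighbors')

print(df_exploded)
```

Because explode repeats the existing index, there is no need to reset the index and set it again, and the result should have the same twelve rows as df_new above.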
