How to explode a list inside a DataFrame cell into separate rows
Problem description
I'm looking to turn a pandas cell containing a list into rows for each of those values.
So, take this:
If I'd like to unpack and stack the values in the 'nearest_neighbors' column so that each value becomes a row within each 'opponent' index, how would I best go about this? Are there pandas methods meant for operations like this? I'm just not aware of any.
Thanks in advance, guys.
In the code below, I first reset the index to make the row iteration easier.
I create a list of lists where each element of the outer list is a row of the target DataFrame and each element of the inner list is one of the columns. This nested list is ultimately passed to pd.DataFrame to create the desired DataFrame.
I use a lambda function together with a list comprehension to create a row for each element of nearest_neighbors, paired with the relevant name and opponent.
Finally, I create a new DataFrame from this list (using the original column names and setting the index back to name and opponent).
import pandas as pd

# Sample data: one list of neighbors per (name, opponent) pair.
df = (pd.DataFrame({'name': ['A.J. Price'] * 3,
                    'opponent': ['76ers', 'blazers', 'bobcats'],
                    'nearest_neighbors': [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']] * 3})
      .set_index(['name', 'opponent']))
>>> df
                                                   nearest_neighbors
name       opponent
A.J. Price 76ers     [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]
           blazers   [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]
           bobcats   [Zach LaVine, Jeremy Lin, Nate Robinson, Isaia]
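# Turn the index levels back into ordinary columns so each row can be read by name.
df.reset_index(inplace=True)
rows = []
# apply() is used only for its side effect: append one output row per neighbor.
_ = df.apply(lambda row: [rows.append([row['name'], row['opponent'], nn])
                          for nn in row.nearest_neighbors], axis=1)
# Build the long-format DataFrame and restore the original index.
df_new = pd.DataFrame(rows, columns=df.columns).set_index(['name', 'opponent'])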
>>> df_new
                     nearest_neighbors
name       opponent
A.J. Price 76ers           Zach LaVine
           76ers            Jeremy Lin
           76ers         Nate Robinson
           76ers                 Isaia
           blazers         Zach LaVine
           blazers          Jeremy Lin
           blazers       Nate Robinson
           blazers               Isaia
           bobcats         Zach LaVine
           bobcats          Jeremy Lin
           bobcats       Nate Robinson
           bobcats               Isaia
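As for whether pandas has a method meant for this: newer versions (0.25 and later) provide DataFrame.explode, which should give the same result without building the rows by hand. The following is a minimal sketch, assuming such a version is installed. It rebuilds the same sample data as above, and df_exploded is just an illustrative name.

```python
import pandas as pd

# Same sample data as in the answer above.
df = (pd.DataFrame({'name': ['A.J. Price'] * 3,
                    'opponent': ['76ers', 'blazers', 'bobcats'],
                    'nearest_neighbors': [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']] * 3})
      .set_index(['name', 'opponent']))

# explode() emits one row per list element and repeats the index values,
# so the (name, opponent) MultiIndex is kept without a reset_index() step.
df_exploded = df.explode('nearest_neighbors')
print(df_exploded)
```

Because explode repeats the existing index, there is no need to reset it first and set it back afterwards, which is what the manual approach has to do.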