Duplicating rows where a cell contains multiple pieces of data
Question
I would like to take a dataframe and duplicate certain rows.
One column, called name, may hold multiple names.
An example dataframe is constructed below:
import pandas as pd

data = [
    ['Joe', '17-11-2018', '2'],
    ['Karen', '17-11-2018', '4'],
    ['Bill, Avery', '17-11-2018', '6'],
    ['Sam', '18-11-2018', '4'],
    ['Alex, Frank', '18-11-2018', '6'],
    ['Chris', '18-11-2018', '8'],
]
df = pd.DataFrame(data, columns=['name', 'date', 'number'])
This produces the following dataframe:
          name        date number
0          Joe  17-11-2018      2
1        Karen  17-11-2018      4
2  Bill, Avery  17-11-2018      6
3          Sam  18-11-2018      4
4  Alex, Frank  18-11-2018      6
5        Chris  18-11-2018      8
I would like to take all rows where there are multiple names (comma-separated) and duplicate them for each individual name. The resulting dataframe should look like this:
    name        date number
0    Joe  17-11-2018      2
1  Karen  17-11-2018      4
2   Bill  17-11-2018      6
3  Avery  17-11-2018      6
4    Sam  18-11-2018      4
5   Alex  18-11-2018      6
6  Frank  18-11-2018      6
7  Chris  18-11-2018      8
Recommended answer
After str.split, this becomes an unnesting problem: each cell in name holds a list, and every list element needs a row of its own.
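# Split on comma plus space so the second name has no leading blank
df['name'] = df['name'].str.split(', ')
# unnesting() is the helper defined at the end of this answer
unnesting(df, ['name'])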
Out[97]:
    name        date number
0    Joe  17-11-2018      2
1  Karen  17-11-2018      4
2   Bill  17-11-2018      6
2  Avery  17-11-2018      6
3    Sam  18-11-2018      4
4   Alex  18-11-2018      6
4  Frank  18-11-2018      6
5  Chris  18-11-2018      8
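import numpy as np
import pandas as pd

def unnesting(df, explode):
    # Repeat each index label once per element of that row's list
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([
        pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    # Join the untouched columns back on the repeated index
    return df1.join(df.drop(columns=explode), how='left')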
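On pandas 0.25 or later, the built-in DataFrame.explode can stand in for a custom unnesting helper. The sketch below is an alternative, not the method from the answer above. It assumes names are separated by commas, strips any whitespace left around each name, and resets the index so the rows are numbered 0 to 7 as in the requested output.

```python
import pandas as pd

data = [
    ['Joe', '17-11-2018', '2'],
    ['Karen', '17-11-2018', '4'],
    ['Bill, Avery', '17-11-2018', '6'],
    ['Sam', '18-11-2018', '4'],
    ['Alex, Frank', '18-11-2018', '6'],
    ['Chris', '18-11-2018', '8'],
]
df = pd.DataFrame(data, columns=['name', 'date', 'number'])

result = (
    # Turn each name cell into a list of names
    df.assign(name=df['name'].str.split(','))
      # Give every list element its own row, copying date and number
      .explode('name')
      # Replace the repeated index labels with a fresh 0..n-1 range
      .reset_index(drop=True)
      # Remove the blank left after each comma
      .assign(name=lambda d: d['name'].str.strip())
)
print(result)
```

Skipping reset_index keeps the original row labels (2, 2, 4, 4 for the split rows), which matches the output of the unnesting helper and can be useful for tracing each row back to its source.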