Split set values from Pandas dataframe cell over multiple rows
Question
I have a pandas DataFrame in the following form:
col1 col2
1 a {hu, fdf, ko, dss}
2 b {sdsjdn, lk}
3 c {sds, aldj, dhva}
Now I want to split the set values over multiple rows so that it looks like this:
col1 col2
1 a hu
2 a fdf
3 a ko
4 a dss
5 b sdsjdn
6 b lk
7 c sds
8 c aldj
9 c dhva
Does anyone have any insight into how I can do this?
Recommended answer
You need numpy.repeat to create the new column of repeated col1 values, and chain.from_iterable to flatten the column of sets:
import pandas as pd

# Sample frame: col2 holds one Python set per row
df = pd.DataFrame({'col1': ['a', 'b', 'c'],
                   'col2': [{'hu', 'fdf', 'ko', 'dss'},
                            {'sdsjdn', 'lk'},
                            {'sds', 'aldj', 'dhva'}]})
print(df)
col1 col2
0 a {hu, dss, ko, fdf}
1 b {lk, sdsjdn}
2 c {dhva, aldj, sds}
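As a side note, here is a small sketch of what the two building blocks do on their own. The names `labels` and `groups` are made up for illustration, and plain lists are used instead of sets so that the order is predictable.

```python
from itertools import chain

import numpy as np

# Hypothetical toy data: one label per group, lists instead of sets
labels = np.array(["a", "b", "c"])
groups = [["x", "y"], ["z"], ["p", "q", "r"]]

# Number of elements in each group
counts = [len(g) for g in groups]

# np.repeat repeats each label as many times as its group has elements
repeated = np.repeat(labels, counts)

# chain.from_iterable concatenates the groups into one flat sequence
flat = list(chain.from_iterable(groups))

print(repeated.tolist())  # ['a', 'a', 'b', 'c', 'c', 'c']
print(flat)               # ['x', 'y', 'z', 'p', 'q', 'r']
```

Because both results follow the same row order, they line up element by element and can be used as the two columns of a new frame.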
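from itertools import chain
import numpy as np

# Repeat each col1 value once per element of its set,
# and flatten the sets in the same row order
df1 = pd.DataFrame({
    "col1": np.repeat(df.col1.values, df.col2.str.len()),
    "col2": list(chain.from_iterable(df.col2))})
print(df1)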
col1 col2
0 a hu
1 a dss
2 a ko
3 a fdf
4 b lk
5 b sdsjdn
6 c dhva
7 c aldj
8 c sds
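Note that sets are unordered, so the order of values within each group above may differ from run to run. As an alternative that is not part of the original answer, newer pandas versions (0.25 and later) offer DataFrame.explode, which might be simpler here. The sketch below first converts each set to a sorted list; that conversion is an added assumption, made only so the row order is reproducible.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "col2": [{"hu", "fdf", "ko", "dss"},
             {"sdsjdn", "lk"},
             {"sds", "aldj", "dhva"}],
})

# Turn each set into a sorted list (added step, for a stable order),
# give every list element its own row, then renumber the index.
df2 = (
    df.assign(col2=df["col2"].map(sorted))
      .explode("col2")
      .reset_index(drop=True)
)
print(df2)
```

The values of col1 are repeated automatically for each exploded element, so no separate numpy.repeat call is needed with this approach.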