pandas 融化2组柱子 [英] Pandas melt 2 groups of columns
问题描述
我有一张桌子
id name sec1 sec2 sec3 sec4 sec5 sec6
1 abc mp(6) up(7) sp(8) cp(7) 2 4
我想输出为
id name tests values slots
1 abc mp 6 2
1 abc up 7 2
1 abc sp 8 2
1 abc cp 7 2
1 abc mp 6 4
1 abc up 7 4
1 abc sp 8 4
1 abc cp 7 4
有人能帮我用melt来规范化吗?提前致谢
Can someone help me with normalising this with melt. Thanks in advance
推荐答案
首先我们从 set_index
和 stack()
开始.
First we start with set_index
and stack()
.
然后我们提取测试值并将其放置在自己的列中并重置索引:
We then extract the test value and place it with its own column and reset the index:
s = df.set_index(['id','name'],append=True).stack().to_frame('tests')
s['values'] = s['tests'].str.extract('(\d+)')[0]
df1 = s.dropna(subset=['values']).reset_index([1,2,3])
df1['tests'] = df1['tests'].str.replace('\(.*\)','')
print(df1)
id name level_3 tests values
0 1 abc sec1 mp 6
0 1 abc sec2 up 7
0 1 abc sec3 sp 8
0 1 abc sec4 cp 7
好的,现在我们需要一些与 SQL 中的交叉连接非常相似的东西,即根据插槽的 len
重复数据帧并相应地添加每个值.
Okay, now we need to something very similar to a cross join in SQL, that is repeat the dataframe according the len
of the slots and add each value accordingly.
注意,这是制作数据框的产品,更好的方法是使用合并,但您没有明确输入,如多行的显示方式.
Note, this is making a product of your dataframe, a better way would be to use merge but you've not made your input clear, as in how multiple rows will appear.
让我们通过 pd.concat
尝试使用 keys
参数:
Let's try this with pd.concat
making use of the keys
argument:
vals = s.loc[s['values'].isna(),'tests'].values
final = pd.concat([df1 ]* len(vals),0,keys=vals)\
.reset_index(0).rename(columns={'level_0' : 'slots'})
print(final[['id','name','tests','values','slots']])
id name tests values slots
0 1 abc mp 6 2
0 1 abc up 7 2
0 1 abc sp 8 2
0 1 abc cp 7 2
0 1 abc mp 6 4
0 1 abc up 7 4
0 1 abc sp 8 4
0 1 abc cp 7 4
使用 merge
避免一个总产品:
vals = s.loc[s['values'].isna(), 'tests'].reset_index([2,3],drop=True)\
.to_frame('slots').reset_index(1)
print(pd.merge(df1,vals,on=['id']))
id name level_3 tests values slots
0 1 abc sec1 mp 6 2
1 1 abc sec1 mp 6 4
2 1 abc sec2 up 7 2
3 1 abc sec2 up 7 4
4 1 abc sec3 sp 8 2
5 1 abc sec3 sp 8 4
6 1 abc sec4 cp 7 2
7 1 abc sec4 cp 7 4
这篇关于 pandas 融化2组柱子的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!