pandas 融化2组柱子 [英] Pandas melt 2 groups of columns

查看:75
本文介绍了 pandas 融化2组柱子的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张桌子

id   name  sec1   sec2   sec3  sec4  sec5  sec6
1    abc   mp(6)  up(7)  sp(8) cp(7)  2     4  

我想输出为

id   name   tests  values   slots
1    abc     mp     6        2
1    abc     up     7        2
1    abc     sp     8        2
1    abc     cp     7        2
1    abc     mp     6        4
1    abc     up     7        4
1    abc     sp     8        4
1    abc     cp     7        4

有人能帮我用melt来规范化吗?提前致谢

Can someone help me with normalising this with melt. Thanks in advance

推荐答案

首先我们从 set_indexstack() 开始.

First we start with set_index and stack().

然后我们提取测试值并将其放置在自己的列中并重置索引:

We then extract the test value and place it with its own column and reset the index:

s = df.set_index(['id','name'],append=True).stack().to_frame('tests')
s['values'] = s['tests'].str.extract('(\d+)')[0]
df1 = s.dropna(subset=['values']).reset_index([1,2,3])
df1['tests'] = df1['tests'].str.replace('\(.*\)','')


print(df1)
   id name level_3 tests values
0   1  abc    sec1    mp      6
0   1  abc    sec2    up      7
0   1  abc    sec3    sp      8
0   1  abc    sec4    cp      7


好的,现在我们需要一些与 SQL 中的交叉连接非常相似的东西,即根据插槽的 len 重复数据帧并相应地添加每个值.


Okay, now we need to something very similar to a cross join in SQL, that is repeat the dataframe according the len of the slots and add each value accordingly.

注意,这是制作数据框的产品,更好的方法是使用合并,但您没有明确输入,如多行的显示方式.

Note, this is making a product of your dataframe, a better way would be to use merge but you've not made your input clear, as in how multiple rows will appear.

让我们通过 pd.concat 尝试使用 keys 参数:

Let's try this with pd.concat making use of the keys argument:

vals = s.loc[s['values'].isna(),'tests'].values

final = pd.concat([df1 ]* len(vals),0,keys=vals)\
            .reset_index(0).rename(columns={'level_0' : 'slots'})


print(final[['id','name','tests','values','slots']])

   id name tests values  slots
0   1  abc    mp      6      2
0   1  abc    up      7      2
0   1  abc    sp      8      2
0   1  abc    cp      7      2
0   1  abc    mp      6      4
0   1  abc    up      7      4
0   1  abc    sp      8      4
0   1  abc    cp      7      4


使用 merge 避免一个总产品:

vals = s.loc[s['values'].isna(), 'tests'].reset_index([2,3],drop=True)\
                   .to_frame('slots').reset_index(1)


print(pd.merge(df1,vals,on=['id']))


   id name level_3 tests values slots
0   1  abc    sec1    mp      6     2
1   1  abc    sec1    mp      6     4
2   1  abc    sec2    up      7     2
3   1  abc    sec2    up      7     4
4   1  abc    sec3    sp      8     2
5   1  abc    sec3    sp      8     4
6   1  abc    sec4    cp      7     2
7   1  abc    sec4    cp      7     4

这篇关于 pandas 融化2组柱子的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆