如何以这种方式删除行索引和展平索引 [英] How to drop row index and flatten index in this way

查看:47
本文介绍了如何以这种方式删除行索引和展平索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下 dfe :-

id       categ  level  cols           value   comment
1         A      PG    Apple           428    comment1 
1         A      CD    Apple           175    comment1 
1         C      PG    Apple           226    comment1 
1         C      AB    Apple           884    comment1 
1         C      CD    Apple           288    comment1 
1         B      PG    Apple           712    comment1 
1         B      AB    Apple           849    comment1 
2         B      CD    Apple           376    comment1 
2         C      None  Orange          591    comment1 
2         B      CD    Orange          135    comment1 
2         D      None  Orange          423    comment1 
2         A      AB    Orange          1e13   comment1 
2         D      PG    Orange          1e15   comment2 





   df2 = pd.DataFrame({'s2': {0: 1, 1: 2, 2: 3}, `level': {0: 'PG', 1: 'AB', 2: 'CD'}})
    df1 = pd.DataFrame({'sl': {0: 1, 1: 2, 2: 3, 3: 4}, 'set': {0: 'A', 1: 'C', 2: 'B', 3: 'D'}})
    dfe = (dfe[['categ','level','cols','id','comment','value']]
            .merge(df1.rename({'set' : 'categ'}, axis=1),how='left',on='categ')
            .merge(df2, how='left', on='level'))
    na = dfe['level'].isna()
    
    dfs = {'no_null': dfe[~na], 'null': dfe[na]}
    
    with pd.ExcelWriter('XYZ.xlsx') as writer: 
        
        for p,r in dfs.items():
            if p== 'no_null':
    
                c= ['cols','s2','level']
            else:
    
                 c = 'cols'
            
            df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values=['value'])
            df.columns = df.columns.droplevel([0,2])
            df  = df.reset_index().drop(('sl',''), axis=1).set_index('categ')
            
            
            for (id,comment), sdf in df.groupby(['id','comment']):
                df = sdf.reset_index(level=[1], drop=True).dropna(how='all', axis=1)
                df.to_excel(writer,sheet_name=name)

运行这个我得到的结果以这种方式显示在 excel 中:-

Running this I get results displayed in excel this way :-

我想以某种方式订购,我试过的:-

I want to order in certain way, what I tried :-

df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values='value')
            df.columns = df.columns.droplevel([1])
            df  = df.reset_index().drop(('sl',''), axis=1).set_index('categ')

这给了我 太多级别:索引只有 2 个级别,而不是 3 个 错误,我不知道我在这里遗漏了什么/错了什么.

This gives me Too many levels: Index has only 2 levels, not 3 error, I don't know what Im missing /wrong here .

我对标题排列的预期输出是:-

My expected output for arrangement of headings is :-

想知道是否可以按照预期输出中所示的方式在 CAPS 中将标题写入 excel.

Would like to know if headings can be written to excel in CAPS as shown in expected output.

编辑 1我尝试了答案,我得到了这个观点:-

EDIT 1 I tried the answer and Im getting this view :-

我希望能够显示 ID &COMMENT 仅一次(因为它已经在代码逻辑中按 ID 分组),并删除 sl 列和第一列 0,1,2 并删除0

I want to be able to display ID & COMMENT only once (as its already grouped by ID in code logic), and drop the sl column and the first column 0,1,2 and also delete the blank row above 0

推荐答案

鉴于 dfe:

   categ level    cols  id   comment         value  sl   s2
0      A    PG   Apple   1  comment1  4.280000e+02   1  1.0
1      A    CD   Apple   1  comment1  1.750000e+02   1  3.0
2      C    PG   Apple   1  comment1  2.260000e+02   2  1.0
3      C    AB   Apple   1  comment1  8.840000e+02   2  2.0
4      C    CD   Apple   1  comment1  2.880000e+02   2  3.0
5      B    PG   Apple   1  comment1  7.120000e+02   3  1.0
6      B    AB   Apple   1  comment1  8.490000e+02   3  2.0
7      B    CD   Apple   2  comment1  3.760000e+02   3  3.0
8      C  None  Orange   2  comment1  5.910000e+02   2  NaN
9      B    CD  Orange   2  comment1  1.350000e+02   3  3.0
10     D  None  Orange   2  comment1  4.230000e+02   4  NaN
11     A    AB  Orange   2  comment1  1.000000e+13   1  2.0
12     D    PG  Orange   2  comment2  1.000000e+15   4  1.0

然后尝试:

df = dfe.pivot_table(index=['id','comment','categ'], columns=c, values='value')
df.columns = df.columns.droplevel([1])

df = (df.rename_axis(columns=[None, None])
        .reset_index(col_level=1)
        .rename(columns = lambda x: x.upper()))
df.to_excel('testa1.xlsx')

输出:

注意事项:

  • 删除了 pivot_table 中value"周围的 [] 以不包括value"作为列索引.
  • 使用 col_level 参数将categ"、label"和comments"与列索引级别 1 对齐.
  • 请参阅这篇关于空行的博文,https://stackoverflow.com/a/52498899/6361531.
  • Removed [] around 'value' in pivot_table to not include 'value' as a column index.
  • Aligned 'categ', 'label' and 'comments' with column index level 1 using col_level parameter.
  • See this post about the blank line, https://stackoverflow.com/a/52498899/6361531.

这篇关于如何以这种方式删除行索引和展平索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆