如何以这种方式删除行索引和展平索引 [英] How to drop row index and flatten index in this way
问题描述
我有以下 dfe
:-
id categ level cols value comment
1 A PG Apple 428 comment1
1 A CD Apple 175 comment1
1 C PG Apple 226 comment1
1 C AB Apple 884 comment1
1 C CD Apple 288 comment1
1 B PG Apple 712 comment1
1 B AB Apple 849 comment1
2 B CD Apple 376 comment1
2 C None Orange 591 comment1
2 B CD Orange 135 comment1
2 D None Orange 423 comment1
2 A AB Orange 1e13 comment1
2 D PG Orange 1e15 comment2
df2 = pd.DataFrame({'s2': {0: 1, 1: 2, 2: 3}, `level': {0: 'PG', 1: 'AB', 2: 'CD'}})
df1 = pd.DataFrame({'sl': {0: 1, 1: 2, 2: 3, 3: 4}, 'set': {0: 'A', 1: 'C', 2: 'B', 3: 'D'}})
dfe = (dfe[['categ','level','cols','id','comment','value']]
.merge(df1.rename({'set' : 'categ'}, axis=1),how='left',on='categ')
.merge(df2, how='left', on='level'))
na = dfe['level'].isna()
dfs = {'no_null': dfe[~na], 'null': dfe[na]}
with pd.ExcelWriter('XYZ.xlsx') as writer:
for p,r in dfs.items():
if p== 'no_null':
c= ['cols','s2','level']
else:
c = 'cols'
df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values=['value'])
df.columns = df.columns.droplevel([0,2])
df = df.reset_index().drop(('sl',''), axis=1).set_index('categ')
for (id,comment), sdf in df.groupby(['id','comment']):
df = sdf.reset_index(level=[1], drop=True).dropna(how='all', axis=1)
df.to_excel(writer,sheet_name=name)
运行这个我得到的结果以这种方式显示在 excel 中:-
Running this I get results displayed in excel this way :-
我想以某种方式订购,我试过的:-
I want to order in certain way, what I tried :-
df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values='value')
df.columns = df.columns.droplevel([1])
df = df.reset_index().drop(('sl',''), axis=1).set_index('categ')
这给了我 太多级别:索引只有 2 个级别,而不是 3 个
错误,我不知道我在这里遗漏了什么/错了什么.
This gives me Too many levels: Index has only 2 levels, not 3
error, I don't know what Im missing /wrong here .
我对标题排列的预期输出是:-
My expected output for arrangement of headings is :-
想知道是否可以按照预期输出中所示的方式在 CAPS 中将标题写入 excel.
Would like to know if headings can be written to excel in CAPS as shown in expected output.
编辑 1我尝试了答案,我得到了这个观点:-
EDIT 1 I tried the answer and Im getting this view :-
我希望能够显示 ID
&COMMENT
仅一次(因为它已经在代码逻辑中按 ID
分组),并删除 sl
列和第一列 0,1,2
并删除0
I want to be able to display ID
& COMMENT
only once (as its already grouped by ID
in code logic), and drop the sl
column and the first column 0,1,2
and also delete the blank row above 0
推荐答案
鉴于 dfe:
categ level cols id comment value sl s2
0 A PG Apple 1 comment1 4.280000e+02 1 1.0
1 A CD Apple 1 comment1 1.750000e+02 1 3.0
2 C PG Apple 1 comment1 2.260000e+02 2 1.0
3 C AB Apple 1 comment1 8.840000e+02 2 2.0
4 C CD Apple 1 comment1 2.880000e+02 2 3.0
5 B PG Apple 1 comment1 7.120000e+02 3 1.0
6 B AB Apple 1 comment1 8.490000e+02 3 2.0
7 B CD Apple 2 comment1 3.760000e+02 3 3.0
8 C None Orange 2 comment1 5.910000e+02 2 NaN
9 B CD Orange 2 comment1 1.350000e+02 3 3.0
10 D None Orange 2 comment1 4.230000e+02 4 NaN
11 A AB Orange 2 comment1 1.000000e+13 1 2.0
12 D PG Orange 2 comment2 1.000000e+15 4 1.0
然后尝试:
df = dfe.pivot_table(index=['id','comment','categ'], columns=c, values='value')
df.columns = df.columns.droplevel([1])
df = (df.rename_axis(columns=[None, None])
.reset_index(col_level=1)
.rename(columns = lambda x: x.upper()))
df.to_excel('testa1.xlsx')
输出:
注意事项:
- 删除了
pivot_table
中value"周围的 [] 以不包括value"作为列索引. - 使用
col_level
参数将categ"、label"和comments"与列索引级别 1 对齐. - 请参阅这篇关于空行的博文,https://stackoverflow.com/a/52498899/6361531.
- Removed [] around 'value' in
pivot_table
to not include 'value' as a column index. - Aligned 'categ', 'label' and 'comments' with column index level 1 using
col_level
parameter. - See this post about the blank line, https://stackoverflow.com/a/52498899/6361531.
这篇关于如何以这种方式删除行索引和展平索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!