Highlight dataframe cells based on multiple conditions in Python
Problem description
Given a small dataset as follows:
   id    room   area           situation
0   1   A-102  world  under construction
1   2     NaN     24  under construction
2   3    B309    NaN                 NaN
3   4   C·102     25    under decoration
4   5  E_1089  hello    under decoration
5   6      27    NaN          under plan
6   7      27    NaN                 NaN
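To reproduce the examples, the table above can be rebuilt by hand. This is a minimal sketch, assuming that `room` and `area` hold strings with NaN for blanks; the question does not state the original dtypes, so that part is a guess.

```python
import numpy as np
import pandas as pd

# Values typed in from the table above. Storing `room` and `area` as
# strings (NaN for blanks) is an assumption about the original data.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "room": ["A-102", np.nan, "B309", "C·102", "E_1089", "27", "27"],
    "area": ["world", "24", np.nan, "25", "hello", np.nan, np.nan],
    "situation": [
        "under construction", "under construction", np.nan,
        "under decoration", "under decoration", "under plan", np.nan,
    ],
})
print(df)
```

Keeping `area` as strings matters because the `.str` accessor used in the checks only works on string data.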
Thanks to the code from @jezrael at this link, I'm able to get the result I needed:
import numpy as np
import pandas as pd

# None marks a cell that passes its check; a message marks a failure.
a = np.where(df.room.str.match(r'^[a-zA-Z\d\-]*$', na = False), None,
             'incorrect room name')
b = np.where(df.area.str.contains(r'^\d+$', na = True), None,
             'area is not a numbers')
c = np.where(df.situation.str.contains('under decoration', na = False),
             'decoration is in the content', None)
# Join the non-null messages of each row; rows without any message get NaN.
f = (lambda x: '; '.join(y for y in x if pd.notna(y))
     if any(pd.notna(np.array(x))) else np.nan)
df['check'] = [f(x) for x in zip(a,b,c)]
print(df)
   id    room   area           situation  \
0   1   A-102  world  under construction
1   2     NaN     24  under construction
2   3    B309    NaN                 NaN
3   4   C·102     25    under decoration
4   5  E_1089  hello    under decoration
5   6      27    NaN          under plan
6   7      27    NaN                 NaN

                                               check
0                              area is not a numbers
1                                incorrect room name
2                                                NaN
3   incorrect room name;decoration is in the content
4  incorrect room name;area is not a numbers;deco...
5                                                NaN
6                                                NaN
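The `na` argument is what decides how missing values are treated by each check: row 1 is flagged for its missing room, while rows with a missing area are not. A small sketch on made-up three-element series (the names `room`, `area`, `room_ok`, and `area_ok` are only for illustration) shows the difference:

```python
import numpy as np
import pandas as pd

room = pd.Series(["A-102", np.nan, "C·102"])
area = pd.Series(["24", np.nan, "hello"])

# na=False: a missing room counts as "no match", so it is flagged later.
room_ok = room.str.match(r"^[a-zA-Z\d\-]*$", na=False)
# na=True: a missing area counts as "match", so it is not flagged.
area_ok = area.str.contains(r"^\d+$", na=True)

print(room_ok.tolist())
print(area_ok.tolist())
```

Swapping the `na` values would flip which column tolerates blanks, so it is worth choosing them per column rather than copying one setting everywhere.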
Now I would like to go further and highlight the problematic cells in the room, area, and situation
columns, then save the dataframe as an Excel file.
How could I do that in Pandas (preferably) or with other Python packages?
Thanks in advance.
Solution
The idea is to create a custom function that returns a DataFrame
of styles, reusing the boolean masks m1, m2, and m3:
import numpy as np
import pandas as pd

# Boolean masks, computed once and reused for both the messages and the styles.
m1 = df.room.str.match(r'^[a-zA-Z\d\-]*$', na = False)
m2 = df.area.str.contains(r'^\d+$', na = True)
m3 = df.situation.str.contains('under decoration', na = False)
a = np.where(m1, None, 'incorrect room name')
b = np.where(m2, None, 'area is not a numbers')
c = np.where(m3, 'decoration is in the content', None)
# Join the non-null messages of each row; rows without any message get NaN.
f = (lambda x: '; '.join(y for y in x if pd.notna(y))
     if any(pd.notna(np.array(x))) else np.nan)
df['check'] = [f(x) for x in zip(a, b, c)]
print(df)
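def highlight(x):
    # Return a DataFrame of CSS strings with the same shape as x.
    c1 = 'background-color: yellow'
    df1 = pd.DataFrame('', index=x.index, columns=x.columns)
    df1['room'] = np.where(m1, '', c1)
    df1['area'] = np.where(m2, '', c1)
    df1['situation'] = np.where(m3, c1, '')
    # print(df1)
    return df1

# axis=None passes the whole DataFrame to highlight in one call.
# Writing .xlsx needs an Excel engine such as openpyxl to be installed.
df.style.apply(highlight, axis = None).to_excel('test.xlsx', index = False)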
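The answer's `highlight` relies on the global masks m1, m2, and m3, which works as long as df is not filtered or reindexed in between. As a hedged variation, not the answer's own code, the sketch below recomputes the masks from whatever frame it receives, so the style frame can be inspected before exporting. The helper names `build_masks` and `style_frame` and the four-row `sample` are hypothetical.

```python
import numpy as np
import pandas as pd

def build_masks(df):
    """Hypothetical helper: recompute the three masks from the given frame."""
    m1 = df["room"].str.match(r"^[a-zA-Z\d\-]*$", na=False)
    m2 = df["area"].str.contains(r"^\d+$", na=True)
    m3 = df["situation"].str.contains("under decoration", na=False)
    return m1, m2, m3

def style_frame(df, css="background-color: yellow"):
    """Hypothetical helper: build CSS strings with the same shape as df."""
    m1, m2, m3 = build_masks(df)
    styles = pd.DataFrame("", index=df.index, columns=df.columns)
    styles["room"] = np.where(m1, "", css)        # flag rooms that fail
    styles["area"] = np.where(m2, "", css)        # flag areas that fail
    styles["situation"] = np.where(m3, css, "")   # flag rows that match
    return styles

# Four rows taken from the question's table (assumed to be strings/NaN).
sample = pd.DataFrame({
    "id": [1, 2, 4, 6],
    "room": ["A-102", np.nan, "C·102", "27"],
    "area": ["world", "24", "25", np.nan],
    "situation": ["under construction", "under construction",
                  "under decoration", "under plan"],
})
styles = style_frame(sample)
print(styles)
```

Because `style_frame` already has the signature that `Styler.apply` expects with `axis=None`, it could be passed directly, for example `sample.style.apply(style_frame, axis=None).to_excel('test.xlsx', index=False)`, assuming jinja2 and an Excel engine such as openpyxl are installed.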