How to normalize a seaborn countplot with multiple categorical variables
Question
I have created a seaborn countplot for multiple categorical variables of a dataframe, but instead of counts I want to show percentages.
What is the best option to use? Barplots? Can I use a loop like the one below to get all the barplots at once?
for i, col in enumerate(df_categorical.columns):
    plt.figure(i)
    sns.countplot(x=col, hue='Response', data=df_categorical)
This loop gives me a countplot for every variable at once.
Thanks!
The data looks like this:
State Response Coverage Education Effective To Date EmploymentStatus Gender Location Code Marital Status Policy Type Policy Renew Offer Type Sales Channel Vehicle Class Vehicle Size
0 Washington No Basic Bachelor 2/24/11 Employed F Suburban Married Corporate Auto Corporate L3 Offer1 Agent Two-Door Car Medsize
1 Arizona No Extended Bachelor 1/31/11 Unemployed F Suburban Single Personal Auto Personal L3 Offer3 Agent Four-Door Car Medsize
2 Nevada No Premium Bachelor 2/19/11 Employed F Suburban Married Personal Auto Personal L3 Offer1 Agent Two-Door Car Medsize
3 California No Basic Bachelor 1/20/11 Unemployed M Suburban Married Corporate Auto Corporate L2 Offer1 Call Center SUV Medsize
4 Washington No Basic Bachelor 2/3/11 Employed M Rural Single Personal Auto Personal L1 Offer1 Agent Four-Door Car Medsize
Recommended answer
Consider a groupby.transform to calculate a percentage column, then run barplot with x set to the original value column and y set to the percent column.
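To make the percentage step concrete, here is a minimal sketch on a toy frame that mirrors only the Response and Gender values of the five sample rows used in this answer. The frame name `df` and the use of `transform("count")` in place of a lambda are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Toy frame: Response and Gender values of the five sample rows
# (the name `df` is illustrative only)
df = pd.DataFrame({
    "Response": ["No", "No", "Yes", "No", "Yes"],
    "Gender":   ["F",  "F",  "F",   "M",  "M"],
})

# Count the rows in each (Response, Gender) group, broadcast that count
# back onto every row of the group, then divide by the total row count.
df["Gender_pct"] = (
    df.groupby(["Response", "Gender"])["Gender"].transform("count") / len(df)
)

print(df)
```

Each row now carries the share of all rows that fall into its own (Response, Gender) combination, which is the value barplot then draws on the y axis.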
Data (the originally posted data, with only two "No" values changed to "Yes")
from io import StringIO
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
txt = '''
State Response Coverage Education "Effective To Date" EmploymentStatus Gender "Location Code" "Marital Status" "Policy Type" Policy "Renew Offer Type" "Sales Channel" "Vehicle Class" "Vehicle Size"
0 Washington No Basic Bachelor "2/24/11" Employed F Suburban Married "Corporate Auto" "Corporate L3" Offer1 Agent "Two-Door Car" Medsize
1 Arizona No Extended Bachelor "1/31/11" Unemployed F Suburban Single "Personal Auto" "Personal L3" Offer3 Agent "Four-Door Car" Medsize
2 Nevada Yes Premium Bachelor "2/19/11" Employed F Suburban Married "Personal Auto" "Personal L3" Offer1 Agent "Two-Door Car" Medsize
3 California No Basic Bachelor "1/20/11" Unemployed M Suburban Married "Corporate Auto" "Corporate L2" Offer1 "Call Center" SUV Medsize
4 Washington Yes Basic Bachelor "2/3/11" Employed M Rural Single "Personal Auto" "Personal L1" Offer1 Agent "Four-Door Car" Medsize'''
df_categorical = pd.read_table(StringIO(txt), sep=r"\s+")
Plot (a single figure with multiple plots in two columns)
fig = plt.figure(figsize=(10,30))

for i, col in enumerate(df_categorical.columns):
    # PERCENT COLUMN CALCULATION
    df_categorical[col+'_pct'] = df_categorical.groupby(['Response', col])[col]\
                                       .transform(lambda x: len(x)) / len(df_categorical)

    plt.subplot(8, 2, i+1)
    sns.barplot(x=col, y=col+'_pct', hue='Response', data=df_categorical)\
       .set(xlabel=col, ylabel='Percent')

plt.tight_layout()
plt.show()
plt.clf()
plt.close('all')
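Note that the percent column above is each group's share of all rows. If the goal is instead to normalize within each Response value, so that the bars for "No" sum to 1 and the bars for "Yes" sum to 1, one possible variation is sketched below. This is an assumption about what "normalize" may mean here, not something the answer states, and the names `df` and `pct` are illustrative.

```python
import pandas as pd

# Toy frame: Response and Gender values of the five sample rows
# (the names `df` and `pct` are illustrative only)
df = pd.DataFrame({
    "Response": ["No", "No", "Yes", "No", "Yes"],
    "Gender":   ["F",  "F",  "F",   "M",  "M"],
})

# Share of each Gender value within each Response group.
# Renaming before reset_index avoids a column-name clash on pandas
# versions that name the result after the counted column.
pct = (
    df.groupby("Response")["Gender"]
      .value_counts(normalize=True)
      .rename("Percent")
      .reset_index()
)

print(pct)
```

The resulting frame has one row per (Response, Gender) pair and could be passed to sns.barplot with x='Gender', y='Percent', hue='Response'.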