Color by Column Values in Matplotlib


Question


    One of my favorite aspects of using the ggplot2 library in R is the ability to easily specify aesthetics. I can quickly make a scatterplot and apply color associated with a specific column, and I would love to be able to do this with python/pandas/matplotlib. Are there any convenience functions that people use to map colors to values using pandas dataframes and Matplotlib?

    ##ggplot scatterplot example with R dataframe, `df`, colored by col3
    ggplot(data = df, aes(x=col1, y=col2, color=col3)) + geom_point()
    
    ##ideal situation with pandas dataframe, 'df', where colors are chosen by col3
    df.plot(x=col1,y=col2,color=col3)
    

    EDIT: Thank you for your responses but I want to include a sample dataframe to clarify what I am asking. Two columns contain numerical data and the third is a categorical variable. The script I am thinking of will assign colors based on this value.

    import numpy as np
    import pandas as pd

    np.random.seed(250)
    df = pd.DataFrame({'Height': np.append(np.random.normal(6, 0.25, size=5), np.random.normal(5.4, 0.25, size=5)),
                       'Weight': np.append(np.random.normal(180, 20, size=5), np.random.normal(140, 20, size=5)),
                       'Gender': ["Male","Male","Male","Male","Male",
                                  "Female","Female","Female","Female","Female"]})
    
         Height      Weight  Gender
    0  5.824970  159.210508    Male
    1  5.780403  180.294943    Male
    2  6.318295  199.142201    Male
    3  5.617211  157.813278    Male
    4  6.340892  191.849944    Male
    5  5.625131  139.588467  Female
    6  4.950479  146.711220  Female
    7  5.617245  121.571890  Female
    8  5.556821  141.536028  Female
    9  5.714171  134.396203  Female
    

    Solution

    Imports and Data

    import numpy 
    import pandas
    import matplotlib.pyplot as plt
    import seaborn
    seaborn.set(style='ticks')
    
    numpy.random.seed(0)
    N = 37
    _genders= ['Female', 'Male', 'Non-binary', 'No Response']
    df = pandas.DataFrame({
        'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
        'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
        'Gender': numpy.random.choice(_genders, size=N)
    })
    

    Update August 2021

    • As of seaborn 0.11.0, it's recommended to use the new figure-level functions such as seaborn.relplot rather than using FacetGrid directly.

    seaborn.relplot(data=df, x='Weight (kg)', y='Height (cm)', hue='Gender', hue_order=_genders, aspect=1.61)
    plt.show()
    

    Update October 2015

    Seaborn handles this use-case splendidly:

    fg = seaborn.FacetGrid(data=df, hue='Gender', hue_order=_genders, aspect=1.61)
    fg.map(plt.scatter, 'Weight (kg)', 'Height (cm)').add_legend()
    

    This immediately produces a scatter plot with one color per gender and a legend (figure not included here).

    Old Answer

    In this case, I would use matplotlib directly.

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    
    def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
        fig, ax = plt.subplots()
        categories = np.unique(df[catcol])
        colors = np.linspace(0, 1, len(categories))
        colordict = dict(zip(categories, colors))  
    
        df["Color"] = df[catcol].apply(lambda x: colordict[x])
        ax.scatter(df[xcol], df[ycol], c=df.Color)
        return fig
    
    if __name__ == '__main__':
        df = pd.DataFrame({'Height':np.random.normal(size=10),
                           'Weight':np.random.normal(size=10),
                           'Gender': ["Male","Male","Unknown","Male","Male",
                                      "Female","Did not respond","Unknown","Female","Female"]})    
        fig = dfScatter(df)
        fig.savefig('fig1.png')
    

    That produces a scatter plot with each category drawn in its own color (figure not included here).

    As far as I know, that color column can hold any Matplotlib-compatible color (RGBA tuples, HTML names, hex values, etc.).
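
To illustrate the point about the color column accepting any Matplotlib-compatible color, here is a minimal sketch that maps each category label to an explicit color. The small data frame, the `palette` dictionary, and the `point_colors` name are made up for illustration and are not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data, shaped like the frame in the question.
df = pd.DataFrame({
    "Height": [5.8, 5.7, 6.3, 5.6, 4.9, 5.6],
    "Weight": [159, 180, 199, 139, 146, 121],
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
})

# Hypothetical palette: a named color and a hex value, one per category.
palette = {"Male": "tab:blue", "Female": "#ff7f0e"}

# Translate each row's category label into its color (one entry per row).
point_colors = df["Gender"].map(palette).tolist()

fig, ax = plt.subplots()
ax.scatter(df["Height"], df["Weight"], c=point_colors)
ax.set_xlabel("Height")
ax.set_ylabel("Weight")
```

A category missing from `palette` would map to NaN and make `scatter` fail, so in practice the dictionary should cover every value in the column.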

    I'm having trouble getting anything but numerical values to work with the colormaps.
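
One possible way around the numeric-only limitation is to convert the category labels to integer codes first and then look each code up in a qualitative colormap. The sketch below assumes a made-up data frame and picks `tab10` as the colormap (so it only distinguishes up to ten categories); it draws one scatter call per category so that a legend comes for free. It is an illustration, not the original answer's method.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data with three categories.
df = pd.DataFrame({
    "Height": [5.8, 5.7, 6.3, 5.6, 4.9, 5.6],
    "Weight": [159, 180, 199, 139, 146, 121],
    "Gender": ["Male", "Male", "Unknown", "Female", "Female", "Unknown"],
})

# factorize turns labels into integer codes; sort=True orders the labels alphabetically.
codes, labels = pd.factorize(df["Gender"], sort=True)

# Assumption: a qualitative colormap with at least as many colors as categories.
cmap = plt.get_cmap("tab10")

fig, ax = plt.subplots()
for code, label in enumerate(labels):
    subset = df[codes == code]  # rows belonging to this category
    # An integer passed to the colormap selects that entry of its color table.
    ax.scatter(subset["Height"], subset["Weight"], color=cmap(code), label=label)
ax.set_xlabel("Height")
ax.set_ylabel("Weight")
ax.legend(title="Gender")
```

Plotting per category costs one `scatter` call per label, which is usually negligible and keeps the legend entries tied to the right colors.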
