Color by Column Values in Matplotlib


Question

One of my favorite aspects of using the ggplot2 library in R is the ability to easily specify aesthetics. I can quickly make a scatterplot and apply color associated with a specific column, and I would love to be able to do this with python/pandas/matplotlib. Are there any convenience functions that people use to map colors to values using pandas dataframes and Matplotlib?

##ggplot scatterplot example with R dataframe, `df`, colored by col3
ggplot(data = df, aes(x=col1, y=col2, color=col3)) + geom_point()

##ideal situation with pandas dataframe, 'df', where colors are chosen by col3
df.plot(x=col1,y=col2,color=col3)

Thank you for your responses, but I want to include a sample dataframe to clarify what I am asking. Two columns contain numerical data and the third is a categorical variable. The script I am thinking of will assign colors based on the value of that categorical column.

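import numpy as np
import pandas as pd

# size=10 so that each numeric column matches the ten Gender labels
df = pd.DataFrame({'Height': np.random.normal(loc=10, size=10),
                   'Weight': np.random.normal(loc=10, size=10),
                   'Gender': ["Male","Male","Male","Male","Male",
                              "Female","Female","Female","Female","Female"]})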

Recommended Answer

Update October 2015

Seaborn handles this use-case splendidly:

import numpy
import pandas
from matplotlib import pyplot
import seaborn
seaborn.set(style='ticks')

numpy.random.seed(0)
N = 37
_genders= ['Female', 'Male', 'Non-binary', 'No Response']
df = pandas.DataFrame({
    'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
    'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
    'Gender': numpy.random.choice(_genders, size=N)
})

fg = seaborn.FacetGrid(data=df, hue='Gender', hue_order=_genders, aspect=1.61)
fg.map(pyplot.scatter, 'Weight (kg)', 'Height (cm)').add_legend()

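This immediately produces a scatter plot of height against weight, with the points colored by gender and a matching legend.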

In this case, I would use matplotlib directly.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
    fig, ax = plt.subplots()
    # Give each category an evenly spaced number in [0, 1]; scatter() maps
    # these numbers through its default colormap.
    categories = np.unique(df[catcol])
    colors = np.linspace(0, 1, len(categories))
    colordict = dict(zip(categories, colors))

    # Note: this adds a "Color" column to the caller's DataFrame.
    df["Color"] = df[catcol].map(colordict)
    ax.scatter(df[xcol], df[ycol], c=df["Color"])
    return fig

if __name__ == '__main__':
    df = pd.DataFrame({'Height':np.random.normal(size=10),
                       'Weight':np.random.normal(size=10),
                       'Gender': ["Male","Male","Unknown","Male","Male",
                                  "Female","Did not respond","Unknown","Female","Female"]})
    fig = dfScatter(df)
    fig.savefig('fig1.png')

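That gives me a scatter plot in which each category is drawn in its own color from the default colormap.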
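The function above encodes the categories as numbers but draws no legend. One common alternative is to call `scatter` once per category so that matplotlib assigns each group its own color and legend entry. The following is a minimal sketch of that idea, not part of the original answer; the helper name `scatter_by_group` and the use of the `Agg` backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_by_group(df, xcol, ycol, catcol):
    """Hypothetical helper: one scatter series per category, with a legend."""
    fig, ax = plt.subplots()
    # groupby yields (category value, sub-frame) pairs in sorted key order;
    # each scatter call takes the next color from the default color cycle.
    for name, group in df.groupby(catcol):
        ax.scatter(group[xcol], group[ycol], label=name)
    ax.set_xlabel(xcol)
    ax.set_ylabel(ycol)
    ax.legend(title=catcol)
    return fig, ax


rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Height": rng.normal(size=10),
    "Weight": rng.normal(size=10),
    "Gender": ["Male", "Male", "Unknown", "Male", "Male",
               "Female", "Did not respond", "Unknown", "Female", "Female"],
})
fig, ax = scatter_by_group(df, "Height", "Weight", "Gender")
```

Unlike `dfScatter`, this version leaves the input DataFrame unmodified, at the cost of one `scatter` call per category.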

As far as I know, that color column can be any matplotlib-compatible color (RGBA tuples, HTML names, hex values, etc.).
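To illustrate the point about color formats, here is a small sketch that maps each category to an explicit color given in one of three forms (a named color, a hex string, and an RGBA tuple). The `palette` dictionary and the sample data are made up for this example; `matplotlib.colors.to_rgba` is used to normalize the mixed formats before they are passed to `scatter`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import to_rgba

# Hypothetical palette: any matplotlib-compatible color spec works as a value.
palette = {
    "Male": "tab:blue",                 # named color
    "Female": "#d62728",                # hex string
    "Unknown": (0.5, 0.5, 0.5, 1.0),    # RGBA tuple
}

df = pd.DataFrame({
    "Height": [1.0, 2.0, 3.0, 4.0],
    "Weight": [4.0, 3.0, 2.0, 1.0],
    "Gender": ["Male", "Female", "Unknown", "Female"],
})

# Convert every entry to an RGBA tuple so the list has one uniform format.
colors = [to_rgba(palette[g]) for g in df["Gender"]]

fig, ax = plt.subplots()
points = ax.scatter(df["Height"], df["Weight"], c=colors)
```

Because the colors are given explicitly, no colormap is involved here, and a category missing from `palette` raises a `KeyError` rather than being silently colored.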

I'm having trouble getting anything but numerical values to work with the colormaps.
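One possible workaround, sketched here as an assumption rather than as the answer's own method, is to turn the string column into integer codes with pandas' categorical dtype and hand those codes to the colormap. The sample data and the choice of the `viridis` colormap are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Weight": [5.0, 3.0, 4.0, 1.0, 2.0],
    "Gender": ["Male", "Female", "Unknown", "Female", "Male"],
})

# Categories are sorted, so Female -> 0, Male -> 1, Unknown -> 2.
cats = df["Gender"].astype("category")
codes = cats.cat.codes

fig, ax = plt.subplots()
# The integer codes are numeric, so the colormap can be applied to them.
points = ax.scatter(df["Height"], df["Weight"], c=codes, cmap="viridis")

# Label the colorbar ticks with the category names instead of the codes.
names = list(cats.cat.categories)
cbar = fig.colorbar(points, ax=ax, ticks=list(range(len(names))))
cbar.ax.set_yticklabels(names)
```

The codes carry no meaningful order beyond alphabetical sorting, so a qualitative colormap may read better than a sequential one for unordered categories.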
