MatPlotlib Seaborn Multiple Plots formatting


Problem description

I am translating a set of R visualizations to Python. I have the following target R multiple plot histograms:

Using a combination of Matplotlib and Seaborn, and with the help of a kind StackOverflow member (see: Python Seaborn Distplot Y value corresponding to a given X value), I was able to create the following Python plot:

I am satisfied with its appearance, except that I don't know how to put the header information in the plots. Here is the Python code that creates the charts:

""" Program to draw the sampling histogram distributions """
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import seaborn as sns

def main():
    """ Main routine for the sampling histogram program """
    sns.set_style('whitegrid')
    markers_list = ["s", "o", "*", "^", "+"]
    # create the data dataframe as df_orig
    df_orig = pd.read_csv('lab_samples.csv')
    df_orig = df_orig.loc[df_orig.hra != -9999]
    hra_list_unique = df_orig.hra.unique().tolist()
    # create and subset df_hra_colors to match the actual hra colors in df_orig
    df_hra_colors = pd.read_csv('hra_lookup.csv')
    df_hra_colors['hex'] = np.vectorize(rgb_to_hex)(df_hra_colors['red'], df_hra_colors['green'], df_hra_colors['blue'])
    df_hra_colors.drop(labels=['red', 'green', 'blue'], axis=1, inplace=True)
    df_hra_colors = df_hra_colors.loc[df_hra_colors['hra'].isin(hra_list_unique)]

    # hard coding the current_component to pc1 here, we will extend it by looping
    # through the list of components
    current_component = 'pc1'
    num_tests = 5
    df_columns = df_orig.columns.tolist()
    start_index = 5
    for test in range(num_tests):
        current_tests_list = df_columns[start_index:(start_index + num_tests)]
        # now create the sns distplots for each HRA color and overlay the tests
        i = 1
        for _, row in df_hra_colors.iterrows():
            plt.subplot(3, 3, i)
            select_columns = ['hra', current_component] + current_tests_list
            df_current_color = df_orig.loc[df_orig['hra'] == row['hra'], select_columns]
            y_data = df_current_color.loc[df_current_color[current_component] != -9999, current_component]
            axs = sns.distplot(y_data, color=row['hex'],
                               hist_kws={"ec":"k"},
                               kde_kws={"color": "k", "lw": 0.5})
            data_x, data_y = axs.lines[0].get_data()
            axs.text(0.0, 1.0, row['hra'], horizontalalignment="left", fontsize='x-small',
                     verticalalignment="top", transform=axs.transAxes)
            for current_test_index, current_test in enumerate(current_tests_list):
                # this_x defines the series of current_component(pc1,pc2,rhob) for this test
                # indicated by 1, corresponding R program calls this test_vector
                x_series = df_current_color.loc[df_current_color[current_test] == 1, current_component].tolist()
                for this_x in x_series:
                    this_y = np.interp(this_x, data_x, data_y)
                    axs.plot([this_x], [this_y - current_test_index * 0.05],
                             markers_list[current_test_index], markersize = 3, color='black')
            axs.xaxis.label.set_visible(False)
            axs.xaxis.set_tick_params(labelsize=4)
            axs.yaxis.set_tick_params(labelsize=4)
            i = i + 1
        start_index = start_index + num_tests
    # plt.show()
    pp = PdfPages('plots.pdf')
    pp.savefig()
    pp.close()

def rgb_to_hex(red, green, blue):
    """Return color as #rrggbb for the given color values."""
    return '#%02x%02x%02x' % (red, green, blue)

if __name__ == "__main__":
    main()

The Pandas code works fine and does what it is supposed to. The bottleneck is my lack of knowledge and experience with 'PdfPages' in Matplotlib. How can I show, in Python/Matplotlib/Seaborn, the header information that I can show in the corresponding R visualization? By header information I mean what the R visualization has at the top, above the histograms, i.e. 'pc1', MRP, XRD, ...

I can get their values easily from my program, e.g., current_component is 'pc1', etc. But I don't know how to format the plots with the Header. Can someone provide some guidance?

Solution

You may be looking for a figure title or super title, fig.suptitle:

fig.suptitle('this is the figure title', fontsize=12)

In your case you can easily get the figure with plt.gcf(), so try

plt.gcf().suptitle("pc1")
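As a minimal sketch of the super title on a 3x3 grid like the one in the question: the random data below is a placeholder for lab_samples.csv, the Agg backend is only chosen so the snippet runs without a display, and the `top=0.9` margin is an assumed value that you would tune for your own figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # placeholder data, not the question's CSV

# Same 3x3 layout as plt.subplot(3, 3, i) in the question
fig, axes = plt.subplots(3, 3, figsize=(8, 8))
for ax in axes.flat:
    ax.hist(rng.normal(size=200), bins=20, edgecolor="k")

# Figure-level title, centred above all subplots
title = fig.suptitle("pc1", fontsize=12)

# Reserve space at the top so the title does not overlap the first row
fig.subplots_adjust(top=0.9)
```

With the question's code, the same call would go after the inner loop over df_hra_colors and before pp.savefig(), using current_component as the title text.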

The rest of the information in the header would be called a legend. For the following, let's suppose all subplots have the same markers. It would then suffice to create a legend for one of the subplots. To create legend labels, you can pass the label argument to the plot call, i.e.

axs.plot( ... , label="MRP")

When axs.legend() is called later, a legend is generated automatically with the respective labels. Ways to position the legend are detailed e.g. in this answer.
Here, you may want to place the legend in terms of figure coordinates, i.e.

axs.legend(loc="lower center", bbox_to_anchor=(0.5, 0.8), bbox_transform=plt.gcf().transFigure)
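Putting the pieces together, here is a hedged sketch of a header made of a super title plus a one-row legend in figure coordinates. One point to watch: the question's loop draws every marker with its own plot call, so passing label= to each call would repeat the same entry in the legend. The sketch avoids that by building one proxy handle per test with Line2D, which is a swapped-in variation on the label approach rather than what the answer shows. Only "MRP" and "XRD" come from the question; the other test names, the random data, and the 0.85/0.87 positions are placeholder assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np

markers_list = ["s", "o", "*", "^", "+"]                 # as in the question
test_names = ["MRP", "XRD", "test3", "test4", "test5"]   # last three are hypothetical

rng = np.random.default_rng(0)  # placeholder data, not the question's CSV
fig, axes = plt.subplots(3, 3, figsize=(8, 8))
for ax in axes.flat:
    ax.hist(rng.normal(size=200), bins=20, density=True, edgecolor="k")
    # Overlay a few marker points per test, offset vertically per test index
    for idx, marker in enumerate(markers_list):
        xs = rng.normal(size=3)
        ys = np.full_like(xs, -0.05 * (idx + 1))
        ax.plot(xs, ys, marker, markersize=3, color="black")

fig.suptitle("pc1")
fig.subplots_adjust(top=0.85)  # leave room for title and legend

# One proxy handle per test, so each test appears exactly once in the legend
handles = [
    Line2D([], [], linestyle="none", marker=m, color="black",
           markersize=5, label=name)
    for m, name in zip(markers_list, test_names)
]

# Attach the legend to one subplot but position it in figure coordinates
legend = axes.flat[0].legend(
    handles=handles,
    loc="lower center",
    bbox_to_anchor=(0.5, 0.87),
    bbox_transform=fig.transFigure,
    ncol=len(handles),
    frameon=False,
)
```

The legend's lower centre sits at x=0.5, y=0.87 of the figure, just above the top row of subplots and below the super title. Saving through PdfPages as in the question should then include both header elements on the page.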
