Annotated heatmap with multiple color schemes

Question

I have the following dataframe and would like to make the minor decimal differences within each "step" visible by giving each step its own color scheme in a heatmap.

Sample data:

Sample  Step 2  Step 3  Step 4  Step 5  Step 6  Step 7  Step 8
A   64.847  54.821  20.897  39.733  23.257  74.942  75.945
B   64.885  54.767  20.828  39.613  23.093  74.963  75.928
C   65.036  54.772  20.939  39.835  23.283  74.944  75.871
D   64.869  54.740  21.039  39.889  23.322  74.925  75.894
E   64.911  54.730  20.858  39.608  23.101  74.956  75.930
F   64.838  54.749  20.707  39.394  22.984  74.929  75.941
G   64.887  54.781  20.948  39.748  23.238  74.957  75.909
H   64.903  54.720  20.783  39.540  23.028  74.898  75.911
I   64.875  54.761  20.911  39.695  23.082  74.897  75.866
J   64.839  54.717  20.692  39.377  22.853  74.849  75.939
K   64.857  54.736  20.934  39.699  23.130  74.880  75.903
L   64.754  54.746  20.777  39.536  22.991  74.877  75.902
M   64.798  54.811  20.963  39.824  23.187  74.886  75.895
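
For anyone who wants to reproduce the plots, here is a minimal sketch of loading the sample above into a DataFrame. It assumes pandas is available and that `Sample` should become the index, since the answer below takes its row labels from `df.index`. The two-or-more-spaces separator is also an assumption about how the table is laid out.

```python
import io

import pandas as pd

raw = """\
Sample  Step 2  Step 3  Step 4  Step 5  Step 6  Step 7  Step 8
A   64.847  54.821  20.897  39.733  23.257  74.942  75.945
B   64.885  54.767  20.828  39.613  23.093  74.963  75.928
C   65.036  54.772  20.939  39.835  23.283  74.944  75.871
D   64.869  54.740  21.039  39.889  23.322  74.925  75.894
E   64.911  54.730  20.858  39.608  23.101  74.956  75.930
F   64.838  54.749  20.707  39.394  22.984  74.929  75.941
G   64.887  54.781  20.948  39.748  23.238  74.957  75.909
H   64.903  54.720  20.783  39.540  23.028  74.898  75.911
I   64.875  54.761  20.911  39.695  23.082  74.897  75.866
J   64.839  54.717  20.692  39.377  22.853  74.849  75.939
K   64.857  54.736  20.934  39.699  23.130  74.880  75.903
L   64.754  54.746  20.777  39.536  22.991  74.877  75.902
M   64.798  54.811  20.963  39.824  23.187  74.886  75.895
"""

# Column names such as "Step 2" contain a single space, so split only on
# runs of two or more whitespace characters (this needs the python engine).
df = pd.read_csv(io.StringIO(raw), sep=r"\s{2,}", engine="python", index_col="Sample")
print(df.shape)
```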

An example of what I am looking for:

Recommended answer

My first approach would be based on a figure with multiple subplots. The number of plots equals the number of columns in your dataframe, and the gap between the plots can be shrunk down to zero:

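import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# df is the sample data above, indexed by the Sample column
cm = ['Blues', 'Reds', 'Greens', 'Oranges', 'Purples', 'bone', 'winter']
f, axs = plt.subplots(1, df.columns.size, gridspec_kw={'wspace': 0})
for i, (s, a, c) in enumerate(zip(df.columns, axs, cm)):
    # one single-column heatmap per step, each with its own colormap
    sns.heatmap(np.array([df[s].values]).T, yticklabels=df.index, xticklabels=[s], annot=True, fmt='.2f', ax=a, cmap=c, cbar=False)
    if i > 0:
        a.yaxis.set_ticks([])  # show the sample labels on the leftmost plot only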

Result:

I'm not sure whether this leads to a helpful or even self-explanatory visualization of the data, but that's your choice. Perhaps it helps to get started...
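
The reason separate subplots bring out the small differences is that `sns.heatmap` infers its color limits from the data it is given when `vmin` and `vmax` are not set, so each single-column heatmap stretches its colormap over that column's own range. A rough sketch of the effect using matplotlib's `Normalize` directly, with the first four values of two columns from the sample data (the variable names are only illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

# First four values of "Step 2" and "Step 4" from the sample data.
step2 = np.array([64.847, 64.885, 65.036, 64.869])
step4 = np.array([20.897, 20.828, 20.939, 21.039])

# One scale shared by both columns: Step 2 is squeezed into a tiny
# slice of the 0..1 color range, so its cells look almost identical.
both = np.concatenate([step2, step4])
shared = Normalize(vmin=both.min(), vmax=both.max())
shared_scaled = np.asarray(shared(step2))
shared_spread = shared_scaled.max() - shared_scaled.min()

# A scale fitted to Step 2 alone: the column spans the full 0..1 range.
own = Normalize(vmin=step2.min(), vmax=step2.max())
own_scaled = np.asarray(own(step2))
own_spread = own_scaled.max() - own_scaled.min()

# Map the per-column scaled values to RGBA colors of one colormap.
colors = plt.get_cmap("Blues")(own_scaled)

print(shared_spread, own_spread)
```

With the shared scale the four Step 2 values cover well under one percent of the colormap, while the per-column scale uses all of it.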

Addendum:

Regarding adding the colorbars: of course you can. But, apart from the fact that I don't know the background of your data or the purpose of the visualization, I'd like to add some thoughts on all that:

First: adding all those colorbars as a separate group of bars on one side of the heatmap or below it is probably possible, but I already find the data quite hard to read, and with all those annotations in place I think it would make a mess of the whole thing.
Additionally: in the meantime @ImportanceOfBeingErnest has provided such a beautiful solution on that topic that repeating it here would not add much, in my opinion.

Second: if you really want to stick to the heatmap approach, perhaps splitting it up and giving every column its own colorbar would suit better:

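import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# df is the sample data above, indexed by the Sample column
cm = ['Blues', 'Reds', 'Greens', 'Oranges', 'Purples', 'bone', 'winter']
f, axs = plt.subplots(1, df.columns.size, figsize=(10, 3))
for i, (s, a, c) in enumerate(zip(df.columns, axs, cm)):
    # cbar is left at its default, so every column draws its own colorbar
    sns.heatmap(np.array([df[s].values]).T, yticklabels=df.index, xticklabels=[s], annot=True, fmt='.2f', ax=a, cmap=c)
    if i > 0:
        a.yaxis.set_ticks([])  # show the sample labels on the leftmost plot only
f.tight_layout()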

However, all that said, I doubt that this is the best visualization for your data. Of course, I don't know what you want to say, see or find with these plots, but that's the point: if the visualization type fit the need, I think I would know (or could at least imagine).

For example, a simple df.plot() results in a plot that, I feel, tells you more about the different characteristics of your columns within a few tenths of a second than the heatmap does.

Or are you explicitly after the differences from each column's mean?

(df - df.mean()).plot()

... or the distribution of each column around its mean?

(df - df.mean()).boxplot()
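
To make explicit what `df - df.mean()` computes in the two snippets above: `df.mean()` returns one mean per column, and the subtraction is aligned on the column labels, so every value becomes its deviation from its own column's mean. A small sketch on the first three samples and two steps of the sample data (this cut-down `df` is only for illustration):

```python
import pandas as pd

# First three samples and two of the steps from the sample data.
df = pd.DataFrame(
    {"Step 2": [64.847, 64.885, 65.036], "Step 4": [20.897, 20.828, 20.939]},
    index=pd.Index(["A", "B", "C"], name="Sample"),
)

# df.mean() is a Series indexed by column name; subtracting it from the
# DataFrame removes each column's own mean from every row.
deviations = df - df.mean()
print(deviations.round(3))
```

After centering, every column averages to zero, which puts columns with very different absolute levels (around 65 versus around 21 here) on a comparable footing.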

What I want to say: data visualization becomes powerful when a plot begins to tell you something about the underlying data before you begin (or have) to explain anything.
