How to properly handle datetime and categorical axes in a bokeh/holoviews heatmap plot?


Problem description

I'm trying to plot a simple heatmap using bokeh/holoviews. My data (a pandas DataFrame) has categoricals on the y axis and datetimes on the x axis. The problem is that there are more than 3000 categories, and the resulting plot shows overlapping, unreadable tick labels on the y axis, which makes it useless. Is there currently a reliable way in bokeh to show only a subset of the ticks depending on the zoom level?

I've already tried plotly and the result looks perfect, but I need to use bokeh/holoviews and datashader. I would also like to avoid replacing the categoricals with numerical ticks.

I've also tried this solution, but it doesn't work (bokeh 1.2.0).

This is a toy example representing my use case (here the y axis has only 100 categories, but it gives the idea):

import numpy as np
import pandas as pd
from bokeh.io import output_notebook
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap

output_notebook()

# build sample data: 1000 timestamps (x) by 100 categories (y)
index = pd.date_range(start='1/1/2019', periods=1000, freq='min')
data = np.random.rand(1000, 100)
columns = ['col' + str(n) for n in range(100)]

# initial data format (wide: one column per category)
df = pd.DataFrame(data=data, index=index, columns=columns)

# reshape to long format for bokeh: one row per (x, y, z) cell
df = df.stack().reset_index()
df.rename(columns={'level_0': 'x', 'level_1': 'y', 0: 'z'}, inplace=True)
df.sort_values(by=['y'], inplace=True)

# convert pandas Timestamps to plain datetime objects
x = [
    date.to_datetime64().astype('M8[ms]').astype('O')
    for date in df.x.to_list()
]

data = {
    'value': df.z.to_list(),
    'x': x,
    'y': df.y.to_list(),
    'date': df.x.to_list(),
}

# categorical y axis: every category gets its own tick label
p = figure(
    x_axis_type='datetime',
    y_range=columns,
    width=900,
    tooltips=[("x", "@date"), ("y", "@y"), ("value", "@value")],
)

# one rectangle per cell, one minute wide (60*1000 ms)
p.rect(
    x='x', y='y', width=60*1000, height=1, line_color=None,
    fill_color=linear_cmap('value', 'Viridis256', low=df.z.min(), high=df.z.max()),
    source=data,
)

show(p)

Solution

In the end, I partially followed the suggestion from James and got it to work using a Python callback as the tick formatter. This solution was hard for me to find: I searched the Bokeh docs, examples and source code for days.

The main problem for me was that the docs do not mention how to use ColumnDataSource objects in the custom callback:

https://docs.bokeh.org/en/1.2.0/docs/reference/models/formatters.html#bokeh.models.formatters.FuncTickFormatter.from_py_func

Finally, this helped a lot:

https://docs.bokeh.org/en/1.2.0/docs/user_guide/interaction/callbacks.html#customjs-with-a-python-function.
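
The pattern described on that page relies on the function's arguments having Bokeh models as default values; the translated JavaScript then receives those models by name, while `tick` is supplied when the axis is rendered. The plain-Python sketch below only illustrates that mechanism, using the standard `inspect` module. It is not Bokeh's implementation: `collect_default_args` and `FakeSource` are hypothetical names made up for this illustration.

```python
import inspect


def collect_default_args(func):
    """Return {parameter name: default value} for a function whose
    parameters all have defaults (hypothetical helper, not a Bokeh API)."""
    params = inspect.signature(func).parameters
    missing = [name for name, p in params.items()
               if p.default is inspect.Parameter.empty]
    if missing:
        raise ValueError("every parameter needs a default value: %s" % missing)
    return {name: p.default for name, p in params.items()}


class FakeSource:
    """Stand-in for a ColumnDataSource: only the .data dict is mimicked."""
    def __init__(self, data):
        self.data = data


cd = FakeSource({"y_ticks": ["col0", "col1", "col2"]})


def ticker(source=cd):
    labels = source.data["y_ticks"]
    # `tick` is not defined here on purpose: in the real formatter it is
    # provided at render time, so this function is never called in Python.
    return "{}".format(labels[tick])


# The data source travels with the function as a default argument value.
args = collect_default_args(ticker)
print(args["source"] is cd)
```

This is why the formatter function takes `source=cd` rather than reading `cd` from the enclosing scope.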

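So I modified the original code as follows, in the hope that it can be useful to someone. The y axis now uses integer positions instead of a categorical range, and the formatter looks up the label for each tick in the ColumnDataSource, which is passed to the function as a default argument value: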

import numpy as np
import pandas as pd
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, FuncTickFormatter
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap

output_notebook()

# build sample data: 1000 timestamps (x) by 100 series (y)
index = pd.date_range(start='1/1/2019', periods=1000, freq='min')
data = np.random.rand(1000, 100)
columns_labels = ['col' + str(n) for n in range(100)]
columns = list(range(100))  # integer y positions instead of categories

# initial data format (wide: one column per y position)
df = pd.DataFrame(data=data, index=index, columns=columns)

# reshape to long format for bokeh: one row per (x, y, z) cell
df = df.stack().reset_index()
df.rename(columns={'level_0': 'x', 'level_1': 'y', 0: 'z'}, inplace=True)
df.sort_values(by=['y'], inplace=True)

# convert pandas Timestamps to plain datetime objects
x = [
    date.to_datetime64().astype('M8[ms]').astype('O')
    for date in df.x.to_list()
]

data = {
    'value': df.z.to_list(),
    'x': x,
    'y': df.y.to_list(),
    'y_labels_tooltip': [columns_labels[k] for k in df.y.to_list()],
    # lookup table for the formatter, repeated only so that this column has
    # the same length as the others; ticks index into its first 100 entries
    'y_ticks': columns_labels*1000,
    'date': df.x.to_list(),
}

cd = ColumnDataSource(data=data)

# translated to JavaScript by from_py_func; `tick` is provided at render time
def ticker(source=cd):
    labels = source.data['y_ticks']
    return "{}".format(labels[tick])

# no y_range here: the y axis is numeric and labelled by the formatter
p = figure(
    x_axis_type='datetime',
    width=900,
    tooltips=[("x", "@date{%F %T}"), ("y", "@y_labels_tooltip"), ("value", "@value")],
)

p.rect(
    x='x', y='y', width=60*1000, height=1, line_color=None,
    fill_color=linear_cmap('value', 'Viridis256', low=df.z.min(), high=df.z.max()),
    source=cd,
)

p.hover.formatters = {'date': 'datetime'}  # key format as used with bokeh 1.2.0
p.yaxis.formatter = FuncTickFormatter.from_py_func(ticker)
p.yaxis[0].ticker.desired_num_ticks = 20  # aim for about 20 ticks at any zoom level

show(p)

The result is this: [screenshot of the resulting heatmap]
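
The formatter above indexes the label list directly with the tick value. On a numeric axis a tick can also fall between two rows or outside the data when zooming or panning, so a more defensive lookup may be worth considering. The following is a minimal plain-Python sketch of that lookup logic only; `label_for_tick` is a hypothetical helper, and porting it into the Bokeh formatter is left as an assumption rather than shown.

```python
def label_for_tick(tick, labels):
    """Map a numeric tick position to its label, or '' if no row matches."""
    index = int(round(tick))
    # Hide ticks that fall between two rows or outside the data range.
    if abs(tick - index) > 1e-9 or not 0 <= index < len(labels):
        return ""
    return labels[index]


columns_labels = ["col" + str(n) for n in range(100)]

# A few tick positions an axis might produce at different zoom levels.
for tick in (0, 20.0, 2.5, -20, 120):
    print(repr(tick), "->", repr(label_for_tick(tick, columns_labels)))
```

Returning an empty string for unmatched ticks keeps the axis free of misleading labels while leaving the tick marks themselves in place.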
