Plotly: How to create a line plot of a time series variable that has a multiple-color label?


Question

I have a DataFrame, df, with a column that I pass as y ('parameter'), and I plot it against the variable 'time'. This variable has 2 labels in the column 'labels', which I pass as color.

import plotly.express as px
fig = px.line(data_frame=df, x='time', y='parameter', color='labels')

Please see the images I have attached for the graph. Both images show the same variable; the 2nd image is a zoomed-in version of the first one, to give a better idea.

As you can see, I am plotting one variable against time and expecting separate colors for the 2 labels. Instead, plotly draws 2 separate lines in the graph, in blue and red, which looks quite messy and wrong. What changes should I make to get one continuous graph in 2 separate colors?

More explanation: I do not want the blue line running through the red graph (please refer to the attached images) or vice versa, as I am plotting only one graph. I want a graph like the one shown in the 3rd image. Thank you in advance.

Recommended answer

Second suggestion

(Please read my first suggestion further down for some assumptions and conditions.)

I've managed to build an approach that should cover pretty much everything you're asking for here. The only detail that poses a real challenge is how gaps between traces are visualized, since my second suggestion builds on adding a separate trace for each consecutive run of a single label. You may suspect that this would fill the legend with a bunch of duplicate names, but that is taken care of by grouping traces by their associated label. I've also set up a dictionary where you can specify a color for each label. This is the result:

Notice the grey line? That's the result of the "connectivity" problem described in my first suggestion. You can choose to hide or show that line by setting the opacity (the last number) in the line color, for example color='rgba(200,200,200,0.2)'. You'll find a complete code snippet to reproduce this figure below. There's a lot going on there to make the whole thing fit together, so don't hesitate to ask about the details if anything is unclear.

# imports
import plotly.graph_objects as go
import pandas as pd
import numpy as np

# settings
observations = 100
np.random.seed(5)
value = np.random.uniform(low=-1, high=1, size=observations).tolist()
time = pd.date_range('2020', freq='D', periods=observations)

df=pd.DataFrame({'time': time, 
                 'value':value})
df['value']=df['value'].cumsum()
df1 = df.copy()
df1=df1.set_index('time')

# custom function to build labels as conditions of parameter values
def classify(e):
    if e > 0.75: return 'high'
    if e > 0.25: return 'medium'
    if e >= 0: return 'low'
    
# custom function to set mode = line or marker, given data length
def modes(df):
    if len(df) > 1: return 'lines'
    else: return  'markers'
    
# dictionary to specify marker or line color
# this will depend on your real world labels !!!
cols = {'high': 'green',
         'medium': 'blue',
         'low': 'red'}

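# normalize values to [0, 1] and derive a label for each observation
df['label1'] = [(elem-df['value'].min())/(df['value'].max()-df['value'].min()) for elem in df['value']]
df['label'] = [classify(elem) for elem in df['label1']]
df = df.drop(columns='label1')  # the positional form drop('label1', 1) is rejected by pandas 2.x

# a new group starts wherever the label changes, so each group is one run of identical labels
df['group'] = df['label'].ne(df['label'].shift()).cumsum()
dfs = [data for _, data in df.groupby('group')]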

# one line to connect them all; the last rgba value is the opacity
fig = go.Figure(go.Scatter(x=df1.index, y=df1['value'],
                           name='all data',
                           line=dict(color='rgba(200,200,200,0.7)')))

showed = []
for frame in dfs:
    label = frame['label'].iloc[0]
    # one trace per run of identical labels; only the first trace of each
    # label gets a legend entry, the rest join it through legendgroup
    fig.add_trace(go.Scatter(x=frame['time'], y=frame['value'],
                             mode=modes(frame),
                             marker_color=cols[label],
                             legendgroup=label,
                             name=label,
                             showlegend=label not in showed))
    showed.append(label)
fig.update_layout(template='plotly_dark')
fig.update_xaxes(showgrid=False)
fig.update_layout(uirevision='constant')
fig.show()
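
The grouping step in the snippet above can be looked at in isolation. The following is only a sketch, not part of the original answer: the helper name split_into_runs and the bridge option are made up for illustration, and the demo data is invented. It splits a frame into runs of consecutive identical labels and, as one possible way to deal with the gaps between traces, optionally appends the first row of the following run to each run so that neighbouring segments share a point.

```python
import pandas as pd


def split_into_runs(df, label_col="label", bridge=True):
    """Split df into consecutive runs of identical labels (hypothetical helper).

    With bridge=True, each run also receives the first row of the run that
    follows it, so line segments drawn from the runs touch each other.
    Assumes df has at least one row.
    """
    # A new run starts wherever the label differs from the previous row.
    run_id = df[label_col].ne(df[label_col].shift()).cumsum()
    runs = [run for _, run in df.groupby(run_id)]
    if not bridge:
        return runs
    bridged = []
    for current, following in zip(runs, runs[1:]):
        bridged.append(pd.concat([current, following.iloc[:1]]))
    bridged.append(runs[-1])  # the last run has nothing to bridge to
    return bridged


# Invented demo data, only to show the shape of the result.
demo = pd.DataFrame({
    "time": pd.date_range("2020-01-01", periods=6, freq="D"),
    "value": [0.1, 0.2, 0.9, 0.8, 0.3, 0.95],
    "label": ["low", "low", "high", "high", "low", "high"],
})

segments = split_into_runs(demo)
for seg in segments:
    # The run's own label is the label of its first row;
    # a bridged last row belongs to the next run.
    print(seg["label"].iloc[0], len(seg))
```

Each segment could then be passed to go.Scatter in the same way as the frames in dfs above, colored by seg['label'].iloc[0]. With bridging, the stretch between two runs takes the color of the earlier run, which is a design choice rather than the only sensible one, and the grey connecting trace may no longer be needed.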

First suggestion

How you should do this depends heavily on the structure of your dataset. From the sound of your question, I can only guess that it looks something like this:

         time     param   label
0  2020-01-01 -0.556014  medium
1  2020-01-02  0.185451    high
2  2020-01-03 -0.401111  medium
3  2020-01-04  0.436111    high
4  2020-01-05  0.412933    high
5  2020-01-06  0.636421    peak
6  2020-01-07  1.168237    peak
7  2020-01-08  1.205073    peak
8  2020-01-09  0.798674    peak
9  2020-01-10  0.174116    high

If so, then you can quickly run into a problem of weird connectivity between your data points if you'd like to display param as a line trace with different colors. The first thing that comes to mind is to combine a line of one color with markers of multiple colors, like this:

This gives you nice interactivity where you can switch all elements on and off, perhaps to study only the parts of your data where label=='peak':

Let me know how this works out for you and we can talk through more details. You'll find a data sample and all details here:

# imports
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np

# settings
observations = 100
np.random.seed(5)
param = np.random.uniform(low=-1, high=1, size=observations).tolist()
# dates as 'YYYY-MM-DD' strings
time = pd.date_range('2020', freq='D', periods=observations).strftime('%Y-%m-%d').tolist()

df=pd.DataFrame({'time': time, 
                 'param':param})
df['param']=df['param'].cumsum()

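# custom function to build labels from the normalized parameter values
def classify(e):
    if e > 0.9: return 'peak'
    if e > 0.75: return 'high'
    if e > 0.25: return 'medium'
    if e > 0.1: return 'low'  # assumed threshold: the original repeated 0.9 here, which made 'low' unreachable
    if e >= 0: return 'bottom'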

df['label1'] = [(elem-df['param'].min())/(df['param'].max()-df['param'].min()) for elem in df['param']]
df['label'] = [classify(elem) for elem in df['label1']]
df = df.drop(columns='label1')  # the positional form drop('label1', 1) is rejected by pandas 2.x

fig=go.Figure((go.Scatter(x=df['time'], y=df['param'],
                         mode='lines',
                         line=dict(color='rgba(0,0,200,0.7)'))))
fig.add_traces(px.scatter(df, x='time', y='param', color='label').data)
fig.update_layout(template='plotly_dark')
fig.update_xaxes(showgrid=False)
fig.show()
         
