Ideas wanted for analyzing near-realtime data over specific intervals with memory/CPU efficiency


Problem description

I have some environmental sensors and I want to detect sudden changes in temperature, as well as slow trends over time. However, I'd like to do most of the math based on what's in memory, with parameters that may look like this (subject to change):

(note: Items in parens are computed in realtime as data is added)

  • 5 minute (derivative, max, min, avg) + 36 datapoints for most current 3 hours
  • 10 minute (derivative, max, min, avg) + 0 datapoints, calc is based on 5min sample
  • 30 minute (derivative, max, min, avg) + 0 datapoints, calc is based on 5 min sample
  • Hourly (derivative, max, min, avg) + 24 datapoints for most current 1 day
  • Daily (derivative, max, min, avg) + 32 datapoints for most current month
  • Monthly (derivative, max, min, avg) + 12 datapoints for past year
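
The 10 minute and 30 minute rows above store no data of their own, so their statistics have to be derived from the newest slots of the 5 minute buffer. A minimal sketch of that idea, assuming the 5 minute samples are available as an array ordered oldest to newest (the class and method names here are hypothetical, not part of the code in this question):

```csharp
using System;

static class DerivedIntervals
{
    // samples: five-minute values ordered oldest to newest (an assumed layout).
    // span: how many of the newest samples to include, e.g. 2 for 10 minutes, 6 for 30 minutes.
    public static float AverageOfNewest(float[] samples, int span)
    {
        if (samples == null) throw new ArgumentNullException("samples");

        int start = Math.Max(0, samples.Length - span);
        float sum = 0f;
        int n = 0;
        for (int i = start; i < samples.Length; i++)
        {
            sum += samples[i];
            n++;
        }
        // NaN signals "no data" when the buffer is empty or span is not positive.
        return n > 0 ? sum / n : float.NaN;
    }
}
```

With this approach the 10 minute average is `AverageOfNewest(fiveMinute, 2)` and the 30 minute average is `AverageOfNewest(fiveMinute, 6)`, so no extra storage is needed for those two rows.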

Each datapoint is a two-byte float. So each sensor will consume up to 124 floats, plus the 24 calculated variables. I'd like to support as many sensors as the .NET embedded device will permit.
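
As a rough sizing check, the per-sensor footprint can be computed from the figures above. This is only a back-of-the-envelope sketch: the 124 stored values and 24 computed values come from the paragraph above, while the bytes per value and the RAM budget (`budgetBytes`) are assumptions to be replaced with real numbers for the target device. Note that a C# `float` (`System.Single`) is 4 bytes, so two bytes per point would need some packed representation such as a scaled `short`. Array and object overhead is ignored here.

```csharp
using System;

static class SensorBudget
{
    // Figures taken from the question: 124 stored datapoints + 24 computed values per sensor.
    public const int StoredPoints = 124;
    public const int ComputedValues = 24;

    // bytesPerValue: 4 for a C# float, 2 for a hypothetical packed format.
    public static int BytesPerSensor(int bytesPerValue)
    {
        return (StoredPoints + ComputedValues) * bytesPerValue;
    }

    // budgetBytes is an assumed RAM allowance for sensor data, not a real device figure.
    public static int MaxSensors(int budgetBytes, int bytesPerValue)
    {
        return budgetBytes / BytesPerSensor(bytesPerValue);
    }
}
```

For example, with an assumed 64 KB budget, `SensorBudget.MaxSensors(65536, 4)` gives 110 sensors and `SensorBudget.MaxSensors(65536, 2)` gives 221.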

Since I'm using an embedded device for this project, my memory is constrained and so is my IO and CPU power.

How would you go about implementing this in .NET? So far, I've created a couple of structs and called it a "TrackableFloat" where the insertion of a value causes the old one to drop off the array and a recalculation is done.
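
The "old value drops off, then recalculate" behaviour described above can be sketched as a fixed-size circular buffer that keeps a running sum, so the average costs O(1) per insert instead of a full rescan. This is an illustrative sketch with hypothetical names (`RollingWindow`), not the `TrackableFloat` implementation shown further down; it ignores min/max and missing readings to stay short.

```csharp
using System;

// Fixed-size circular buffer that keeps a running sum so the average is O(1) per insert.
class RollingWindow
{
    private readonly float[] _values;
    private int _next;   // index that the next Add will overwrite
    private int _count;  // number of slots filled so far
    private float _sum;

    public RollingWindow(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _values = new float[capacity];
    }

    public int Count { get { return _count; } }

    // NaN signals that no data has been added yet.
    public float Average { get { return _count > 0 ? _sum / _count : float.NaN; } }

    public void Add(float value)
    {
        if (_count == _values.Length)
            _sum -= _values[_next];   // buffer full: the oldest value drops off
        else
            _count++;

        _values[_next] = value;
        _sum += value;
        _next = (_next + 1) % _values.Length;
    }
}
```

One trade-off of a running `float` sum is that rounding error can accumulate over many inserts; recomputing the sum from the buffer occasionally is a cheap way to bound that drift.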

The only thing that makes this more complicated than it would otherwise be is that if any sensor does not report back data, that datapoint needs to be excluded/ignored from all subsequent realtime calculations.
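
One way to handle unreported datapoints is to store a sentinel in the slot and skip it when computing statistics. The sketch below assumes `float.NaN` as that sentinel (a convention chosen for this example; the code later in this question uses `float.MaxValue` instead). Because `NaN` never compares equal to anything, including itself, the check has to go through `float.IsNaN`. The type and method names are hypothetical.

```csharp
using System;

struct WindowStats
{
    public float Min;
    public float Max;
    public float Average;
    public int ValidCount;
}

static class MissingAwareStats
{
    // slots: one entry per interval; float.NaN marks a slot whose sensor did not report.
    public static WindowStats Compute(float[] slots)
    {
        var s = new WindowStats();
        s.Min = float.NaN;
        s.Max = float.NaN;
        s.Average = float.NaN;

        float sum = 0f;
        for (int i = 0; i < slots.Length; i++)
        {
            float v = slots[i];
            if (float.IsNaN(v)) continue;   // missing reading: excluded from every statistic

            if (s.ValidCount == 0) { s.Min = v; s.Max = v; }
            else
            {
                if (v < s.Min) s.Min = v;
                if (v > s.Max) s.Max = v;
            }
            sum += v;
            s.ValidCount++;
        }

        if (s.ValidCount > 0) s.Average = sum / s.ValidCount;
        return s;
    }
}
```

Dividing by `ValidCount` rather than by the buffer length is what keeps a missing slot from dragging the average down.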

When all is said and done, if any of the values (derivative, max, min, avg) reaches a predefined setting, then a .NET event fires.
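
The alerting step could look roughly like the following: compare freshly computed statistics against configured limits and raise a standard .NET event when one is reached. Everything here (`ThresholdMonitor`, `ThresholdEventArgs`, the two limits) is hypothetical and only checks max and derivative to keep the sketch short. It also assumes generic `EventHandler<T>` is available, which may not hold on every embedded .NET profile.

```csharp
using System;

class ThresholdEventArgs : EventArgs
{
    public readonly string Statistic;
    public readonly float Value;
    public readonly float Limit;

    public ThresholdEventArgs(string statistic, float value, float limit)
    {
        Statistic = statistic;
        Value = value;
        Limit = limit;
    }
}

class ThresholdMonitor
{
    private readonly float _maxLimit;
    private readonly float _derivativeLimit;

    public event EventHandler<ThresholdEventArgs> ThresholdReached;

    public ThresholdMonitor(float maxLimit, float derivativeLimit)
    {
        _maxLimit = maxLimit;
        _derivativeLimit = derivativeLimit;
    }

    // Call after each recalculation with the current statistics.
    public void Check(float max, float derivative)
    {
        if (max >= _maxLimit)
            Raise("Max", max, _maxLimit);

        // Absolute value so that fast drops are reported as well as fast rises.
        if (Math.Abs(derivative) >= _derivativeLimit)
            Raise("Derivative", derivative, _derivativeLimit);
    }

    private void Raise(string statistic, float value, float limit)
    {
        // Copy the delegate first so an unsubscribe on another thread cannot null it mid-call.
        EventHandler<ThresholdEventArgs> handler = ThresholdReached;
        if (handler != null)
            handler(this, new ThresholdEventArgs(statistic, value, limit));
    }
}
```

A subscriber attaches with `monitor.ThresholdReached += (sender, e) => { /* react to e.Statistic */ };` and the tracking code calls `monitor.Check(...)` after each insert.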

I think someone out there will think this is an interesting problem, and would love to hear how they would approach implementing it.

  1. Would you use a Class or a Struct?

  2. How would you trigger the calculations? (Events most likely)

  3. How would the alerts be triggered?

  4. How would you store the data, in tiers?

  5. Is there a library that already does something like this? (maybe that should be my first question)

  6. How would you efficiently calculate the derivative?
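
On question 6, one cheap option is a finite difference between the two most recent valid samples, which needs only the previous value and its timestamp rather than the whole buffer. The sketch below is one possible approach under that assumption, with hypothetical names; it treats `float.NaN` as "no reading" and reports the rate in units per minute.

```csharp
using System;

class DerivativeTracker
{
    private bool _hasPrevious;
    private float _previousValue;
    private DateTime _previousTime;

    // Rate of change in units per minute; NaN until two valid samples have been seen.
    public float Derivative { get; private set; }

    public DerivativeTracker()
    {
        Derivative = float.NaN;
    }

    public void Add(DateTime time, float value)
    {
        // Missing reading: keep the last valid sample as the baseline.
        if (float.IsNaN(value)) return;

        if (_hasPrevious)
        {
            double minutes = (time - _previousTime).TotalMinutes;
            if (minutes <= 0) return;   // ignore duplicate or out-of-order timestamps

            Derivative = (float)((value - _previousValue) / minutes);
        }

        _previousValue = value;
        _previousTime = time;
        _hasPrevious = true;
    }
}
```

Dividing by the actual elapsed time, rather than assuming a fixed interval, means a gap caused by a missed reading does not exaggerate the slope. A two-point difference is sensitive to noise, so a slope over several samples may be preferable for the slow-trend intervals.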

Here is my first crack at this, and it doesn't completely hit the spec, but is very efficient. Would be interested in hearing your thoughts.

using System;

enum UnitToTrackEnum
{
    Minute,
    FiveMinute,
    TenMinute,
    FifteenMinute,
    Hour,
    Day,
    Week,
    Month,
    unknown
}
class TrackableFloat
{
    object Gate = new object();

    UnitToTrackEnum trackingMode = UnitToTrackEnum.unknown;
    int ValidFloats = 0;
    float[] FloatsToTrack;

    public TrackableFloat(int HistoryLength, UnitToTrackEnum unitToTrack)
    {
        if (unitToTrack == UnitToTrackEnum.unknown)
            throw new InvalidOperationException("You must not have an unknown measure of time to track.");

        FloatsToTrack = new float[HistoryLength];

        // Mark every slot as empty; float.MaxValue is the "no data" sentinel
        for (int i = 0; i < FloatsToTrack.Length; i++)
        {
            FloatsToTrack[i] = float.MaxValue;
        }


        trackingMode = unitToTrack;

        Min = float.MaxValue;
        Max = float.MinValue;
        Sum = 0;
    }
    public void Add(DateTime dt, float value)
    {
        int RoundedDTUnit = 0;

        switch (trackingMode)
        {
            case UnitToTrackEnum.Minute:
                {
                    RoundedDTUnit = dt.Minute;
                    break;
                }
            case UnitToTrackEnum.FiveMinute:
                {
                    RoundedDTUnit = System.Math.Abs(dt.Minute / 5);
                    break;
                }
            case UnitToTrackEnum.TenMinute:
                {
                    RoundedDTUnit = System.Math.Abs(dt.Minute / 10);
                    break;
                }
            case UnitToTrackEnum.FifteenMinute:
                {
                    RoundedDTUnit = System.Math.Abs(dt.Minute / 15);
                    break;
                }
            case UnitToTrackEnum.Hour:
                {
                    RoundedDTUnit = dt.Hour;
                    break;
                }
            case UnitToTrackEnum.Day:
                {
                    RoundedDTUnit = dt.Day;
                    break;
                }
            case UnitToTrackEnum.Week:
                {
                    //RoundedDTUnit = System.Math.Abs( );
                    break;
                }
            case UnitToTrackEnum.Month:
                {
                    RoundedDTUnit = dt.Month;
                    break;
                }
            case UnitToTrackEnum.unknown:
                {
                    throw new InvalidOperationException("You must not have an unknown measure of time to track.");
                }
            default:
                break;
        }


        bool DoRefreshMaxMin = false;
        // The slot index must fall inside the history array
        if (RoundedDTUnit < FloatsToTrack.Length)
        {
            if (value == float.MaxValue || value == float.MinValue)
            {
                // If invalid data...
                lock (Gate)
                {
                    // Get rid of old data...
                    var OldValue = FloatsToTrack[RoundedDTUnit];
                    if (OldValue != float.MaxValue && OldValue != float.MinValue)
                    {
                        Sum -= OldValue;
                        ValidFloats--;

                        if (OldValue == Max || OldValue == Min)
                            DoRefreshMaxMin = true;
                    }

                    // Save new data
                    FloatsToTrack[RoundedDTUnit] = value;
                }
            }
            else
            {
                lock (Gate)
                {
                    // Get rid of old data...
                    var OldValue = FloatsToTrack[RoundedDTUnit];
                    if (OldValue != float.MaxValue && OldValue != float.MinValue)
                    {
                        Sum -= OldValue;
                        ValidFloats--;
                    }

                    // Save new data
                    FloatsToTrack[RoundedDTUnit] = value;
                    Sum += value;
                    ValidFloats++;

                    if (value < Min)
                        Min = value;

                    if (value > Max)
                        Max = value;

                    if (OldValue == Max || OldValue == Min)
                        DoRefreshMaxMin = true;
                }
            }

            // Called outside the lock. (C# locks are re-entrant on the same thread,
            // so calling it inside the lock would not deadlock either.)
            if (DoRefreshMaxMin == true)
                RefreshMaxMin();
        }
        else
        {
            throw new IndexOutOfRangeException("Index " + RoundedDTUnit + " is out of range for tracking mode: " + trackingMode.ToString());
        }
    }

    public float Sum { get; set; }
    public float Average
    {
        get
        {
            if (ValidFloats > 0)
                return Sum / ValidFloats;
            else
                return float.MaxValue;
        }
    }
    public float Min { get; set; }
    public float Max { get; set; }
    public float Derivative { get; set; }

    public void RefreshCounters()
    {
        lock (Gate)
        {
            float sum = 0;
            ValidFloats = 0;

            Min = float.MaxValue;
            Max = float.MinValue;

            foreach (var i in FloatsToTrack)
            {
                if (i != float.MaxValue && i != float.MinValue)
                {
                    if (Min == float.MaxValue)
                    {
                        Min = i;
                        Max = i;
                    }

                    sum += i;
                    ValidFloats++;

                    if (i < Min)
                        Min = i;

                    if (i > Max)
                        Max = i;
                }
            }
            Sum = sum;
        }
    }
    public void RefreshMaxMin()
    {
        if (ValidFloats > 0)
        {
            lock (Gate)
            {
                // Reset inside the lock so a concurrent Add cannot interleave with the rescan
                Min = float.MaxValue;
                Max = float.MinValue;

                foreach (var i in FloatsToTrack)
                {
                    if (i != float.MaxValue && i != float.MinValue)
                    {
                        if (i < Min)
                            Min = i;

                        if (i > Max)
                            Max = i;
                    }
                }
            }
        }
    }
}

Solution

You should consider looking at a CEP library like Nesper.
