检测数据集中的重大变化,然后逐渐变化 [英] Detect significant changes in a data-set that gradually changes

查看:78
本文介绍了检测数据集中的重大变化,然后逐渐变化的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在python中有一个数据列表,代表每分钟使用的资源量。我想找到它在该数据集中发生显着变化的次数。我所说的重大更改与到目前为止所读内容有所不同。



例如如果我有一个数据集,例如
[10,15,17,20,30,40,50,70,80,60,40,20]



我说当数据比以前的正常值增加一倍或减少一半时,会发生重大变化。



例如因为列表以10开头,所以这是我们的起始法线点。



然后,当数据加倍到20时,我认为这是一项重大更改,并将法线设置为20。 / p>

然后,当数据增加一倍至40时,它被认为是显着的变化,现在的正常值为40



然后数据翻倍到80,这被认为是一个重大变化,现在正常值是80



此后,当数据减少一半到40时,它被认为是另一个重大变化,正常变为40



最后,当数据减少一半至20时,这是最后一次重大更改



此处一共有5个重大更改。



它是否与其他任何更改检测算法相似?

解决方案

这是相对简单的。您可以通过列表进行一次迭代来完成此操作。当发生重大变化时,我们只需更新我们的基准即可。



请注意,我的实现适用于任何可迭代的或容器的容器。例如,如果您想要读取文件而不必将其全部加载到内存中,则此功能很有用。

  def gen_significant_changes (可迭代,*,tol = 2):
可迭代=可迭代(可迭代)#如果它是容器而不是生成器,则这是必需的。
#注意,如果iterable已经是一个生成器,则iter(iterable)将返回自身。
base = next(iterable)
for x in iterable:
如果x> =(base * tol)或x< =(base / tol):
yield x
base = x

my_list = [10,15,17,20,30,40,50,70,80,60,40,20]

打印(列表(gen_significant_changes(my_list)))


I have a list of data in python that represents amount of resources used per minute. I want to find the number of times it changes significantly in that data set. What I mean by significant change is a bit different from what I've read so far.

For e.g. if I have a dataset like [10,15,17,20,30,40,50,70,80,60,40,20]

I say a significant change happens when data increases by double or reduces by half with respect to the previous normal.

For e.g. since the list starts with 10, that is our starting normal point

Then when data doubles to 20, I count that as one significant change and set the normal to 20.

Then when data doubles to 40, it is considered a significant change and the normal is now 40

Then when data doubles to 80, it is considered a significant change and the normal is now 80

After that when data reduces by half to 40, it is considered as another significant change and the normal becomes 40

Finally when data reduces by half to 20, it is the last significant change

Here there are a total of 5 significant changes.

Is it similar to any other change detection algorithm? How can this be done efficiently in python?

解决方案

This is relatively straightforward. You can do this with a single iteration through the list. We simply update our base when a 'significant' change occurs.

Note that my implementation will work for any iterable or container. This is useful if you want to, for example, read through a file without having to load it all into memory.

def gen_significant_changes(iterable, *, tol = 2):
    iterable = iter(iterable) # this is necessary if it is container rather than generator.
    # note that if the iterable is already a generator iter(iterable) returns itself.
    base = next(iterable)
    for x in iterable:
        if x >= (base * tol) or x <= (base/tol):
            yield x
            base = x

my_list = [10,15,17,20,30,40,50,70,80,60,40,20]

print(list(gen_significant_changes(my_list)))

这篇关于检测数据集中的重大变化,然后逐渐变化的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆