Deedle moving window stats calculation with a dynamic condition and Boundary.AtEnding


Problem description

I am using a dynamic moving window to calculate simple stats on a series ordered by its date key. I want to be able to set the boundary at the end of the window. For example, for a time series with a monthly moving average, the month is determined by a condition such as

(fun d1 d2 -> d1.AddMonths(1) <= d2)

However, the Deedle series function

windowWhileInto cond f series

always uses the beginning as the boundary. Therefore, it always produces a series of n data points that starts at the first data point and runs forward over the next n data points (n is determined by the function above). I would like a series of n data points that ends at the nth data point and looks backwards into the past.

I also tried using Series.Rev first to reverse the series, but Deedle considers the reversed series to be no longer ordered.

Is what I am looking for possible?

Solution

If you look at the list of aggregation functions in the docs, you'll find a function aggregate that is a generalization of all the windowing & chunking functions and also takes a key selector.

This means that you can do something like this:

ts |> Series.aggregateInto
        (WindowWhile(fun d1 d2 -> d1.AddMonths(1) >= d2))  // Aggregation to perform
        (fun seg -> seg.Data.LastKey())                    // Key selector (use last)
        (fun ds -> OptionalValue(ds.Data))                 // Value selector

Besides the series itself, the function takes three arguments: the aggregation to perform, a key selector, and a value selector. Both selectors receive a "data segment", which holds the window together with a flag indicating whether it is complete or incomplete (e.g. at the end of windowing).

Sadly, this does not quite work here, because it will create a series with duplicate keys (and those are not supported by Deedle). The windows at the end of the chunk will all end with the same date and so you'll get duplicate keys (it actually runs, but you cannot do much with the series).
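
To see why the end keys repeat, here is a minimal sketch in plain F# (no Deedle involved). It assumes that a window starts at every key and keeps growing while the condition holds, as described above. The sample dates and the names dates, cond and windowEnds are made up for illustration.

```fsharp
open System

// Hypothetical sample keys (not taken from the question).
let dates =
    [ DateTime(2013, 1, 1); DateTime(2013, 1, 15); DateTime(2013, 2, 1)
      DateTime(2013, 2, 15); DateTime(2013, 3, 1) ]

// Same condition as above: keep extending the window while it holds.
let cond (d1: DateTime) (d2: DateTime) = d1.AddMonths(1) >= d2

// For every start key, take the following keys while the condition holds
// and record the last key of that window.
let windowEnds =
    dates
    |> List.mapi (fun i first ->
        dates
        |> List.skip i
        |> List.takeWhile (cond first)
        |> List.last)

windowEnds |> List.iter (fun d -> printfn "%s" (d.ToString("yyyy-MM-dd")))
```

With these sample dates the last three windows all end on 2013-03-01, so using the last key of each window as the new key produces duplicates near the end of the series.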

An ugly workaround is to remember the last chunk's end and return missing values once the end starts repeating:

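let lastKey = ref None   // End key of the most recently emitted window
let r = 
  ts 
  |> Series.aggregateInto
      (WindowWhile(fun d1 d2 -> d1.AddMonths(1) >= d2))   // Aggregation to perform
      (fun seg -> seg.Data.LastKey())                     // Key selector (use last)
      (fun ds -> 
         match lastKey.Value, ds.Data.LastKey() with 
         // Same end key as the previous window: return missing to avoid a duplicate key
         | Some lk, clk when lk = clk -> OptionalValue.Missing
         | _, clk -> lastKey.Value <- Some clk; OptionalValue(ds.Data))
  |> Series.dropMissing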

EDIT: I logged a GitHub issue for this.
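
For comparison, the backward-looking windows the question asks for can also be sketched in plain F# without Deedle. This is not the approach used in the answer above, only an illustration under assumptions: the sample observations and the names observations and trailingMonthlyMean are hypothetical, and the quadratic scan is chosen for brevity, not speed.

```fsharp
open System

// Hypothetical sample observations (date, value); not taken from the question.
let observations =
    [ (DateTime(2013, 1, 1), 1.0)
      (DateTime(2013, 1, 15), 2.0)
      (DateTime(2013, 2, 1), 3.0)
      (DateTime(2013, 2, 15), 4.0)
      (DateTime(2013, 3, 1), 5.0) ]

// For each end key, average every observation whose key lies within one
// month before it (inclusive), mirroring d1.AddMonths(1) >= d2.
let trailingMonthlyMean =
    observations
    |> List.map (fun (endKey, _) ->
        let mean =
            observations
            |> List.filter (fun (d, _) -> d <= endKey && d.AddMonths(1) >= endKey)
            |> List.averageBy snd
        (endKey, mean))

trailingMonthlyMean
|> List.iter (fun (d, m) -> printfn "%s %.2f" (d.ToString("yyyy-MM-dd")) m)
```

Because every window is keyed by its own end date, the keys are unique by construction. The first few windows cover less than a full month, so they may need to be dropped if only complete windows are wanted. The resulting (key, value) pairs could then be loaded back into Deedle, for instance with Series.ofObservations.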
