Deedle moving window stats calculation with a dynamic condition and boundary at ending
I am using a dynamic moving window to calculate simple statistics on a series ordered by its date key, and I want to be able to set the boundary at the end of the window. For example, for a time series with a monthly moving average, the month is decided by a condition such as
(fun d1 d2 -> d1.AddMonths(1) <= d2)
However, the Deedle series function
windowWhileInto cond f series
always uses the beginning of the window as the boundary. For each data point it therefore produces a series of n data points that starts at that point and runs forward over the next n points (n is decided by the function above). I would like instead to have a series of n data points that ends at the nth point and looks backwards into the past.
I also tried reversing the series first with Series.rev, but Deedle considers the reversed series to be no longer ordered.
Is what I am looking for possible?
If you look at the list of aggregation functions in the docs, you'll find a function aggregate that is a generalization of all the windowing and chunking functions and also takes a key selector.
This means that you can do something like this:
ts |> Series.aggregateInto
    (WindowWhile(fun d1 d2 -> d1.AddMonths(1) >= d2)) // Aggregation to perform
    (fun seg -> seg.Data.LastKey())                   // Key selector (use last)
    (fun ds -> OptionalValue(ds.Data))                // Value selector
The function takes three parameters: the aggregation to perform, a key selector, and a value selector. Both selectors receive a "data segment", which holds the window together with a flag saying whether it is complete or incomplete (e.g. at the end of the windowing).
Sadly, this does not quite work here, because it will create a series with duplicate keys (and those are not supported by Deedle). The windows at the end of the chunk will all end with the same date and so you'll get duplicate keys (it actually runs, but you cannot do much with the series).
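To make the duplicate-key problem concrete, here is a minimal sketch in plain F# lists. It does not call Deedle; `keys` is hypothetical sample data and `windowFrom` is an illustrative helper that only imitates a start-anchored window under the same month condition.

```fsharp
open System

// Hypothetical sample data: one observation per day for 40 days.
let keys = [ for i in 0 .. 39 -> DateTime(2020, 1, 1).AddDays(float i) ]

// Imitates a start-anchored window: begin at `start` and keep taking keys
// while the condition on (first key, current key) holds.
let windowFrom (start: DateTime) =
    keys
    |> List.skipWhile (fun d -> d < start)
    |> List.takeWhile (fun d -> start.AddMonths(1) >= d)

// Key every window by its last element, as the key selector above does.
let lastKeys = keys |> List.map (windowFrom >> List.last)

// End dates that are shared by more than one window.
let duplicates =
    lastKeys |> List.countBy id |> List.filter (fun (_, n) -> n > 1)
```

With this sample, every window that starts on or after 9 January runs into the end of the data, so all of those windows end on 9 February and `duplicates` is non-empty.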
An ugly workaround is to remember the last chunk's end and return missing values once the end starts repeating:
let lastKey = ref None
let r =
    ts |> Series.aggregateInto
        (WindowWhile(fun d1 d2 -> d1.AddMonths(1) >= d2))
        (fun seg -> seg.Data.LastKey())
        (fun ds ->
            match lastKey.Value, ds.Data.LastKey() with
            // Same end key as the previous window: emit a missing value
            | Some lk, clk when lk = clk -> OptionalValue.Missing
            // New end key: remember it and keep the window
            | _, clk ->
                lastKey.Value <- Some clk
                OptionalValue(ds.Data))
    |> Series.dropMissing
EDIT: I logged a GitHub issue for this.
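For comparison, the end-anchored window the question asks for can be sketched without any library support. This is not the Deedle API and not the approach from the answer above; it is a plain F# illustration in which `data` is hypothetical sample data and `trailingMonthlyMean` is an illustrative name. It assumes the window is inclusive at both ends.

```fsharp
open System

// Hypothetical sample data: a value for each of 40 consecutive days.
let data =
    [ for i in 0 .. 39 -> DateTime(2020, 1, 1).AddDays(float i), float i ]

// For each observation, average every value whose key lies in the month
// ending at that observation's key (inclusive on both sides).
let trailingMonthlyMean (points: (DateTime * float) list) =
    points
    |> List.map (fun (endKey, _) ->
        let startKey = endKey.AddMonths(-1)
        let mean =
            points
            |> List.filter (fun (k, _) -> k >= startKey && k <= endKey)
            |> List.averageBy snd
        endKey, mean)

let result = trailingMonthlyMean data
```

Each input key produces exactly one output key, so no duplicates arise. The sketch rescans the whole list for every point, which is quadratic and only reasonable for small series.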