C中的滚动中值算法 [英] Rolling median algorithm in C

查看:20
本文介绍了C中的滚动中值算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在研究一种用 C 语言实现滚动中值滤波器(类似于滚动均值滤波器)的算法.从我对文献的搜索来看,似乎有两种相当有效的方法可以做到这一点.第一种是对初始值窗口进行排序,然后执行二分搜索以插入新值并在每次迭代时删除现有值.

I am currently working on an algorithm to implement a rolling median filter (analogous to a rolling mean filter) in C. From my search of the literature, there appear to be two reasonably efficient ways to do it. The first is to sort the initial window of values, then perform a binary search to insert the new value and remove the existing one at each iteration.

第二个(来自 Hardle 和 Steiger,1995,JRSS-C,算法 296)构建了一个双端堆结构,一端是 maxheap,另一端是 minheap,中间是中值.这产生了一种线性时间算法,而不是 O(n log n).

The second (from Hardle and Steiger, 1995, JRSS-C, Algorithm 296) builds a double-ended heap structure, with a maxheap on one end, a minheap on the other, and the median in the middle. This yields a linear-time algorithm instead of one that is O(n log n).

这是我的问题:实现前者是可行的,但我需要在数百万个时间序列上运行它,所以效率很重要.后者被证明非常难以实施.我在R的stats包的代码的Trunmed.c文件中找到了代码,但是比较难辨认.

Here is my problem: implementing the former is doable, but I need to run this on millions of time series, so efficiency matters a lot. The latter is proving very difficult to implement. I found code in the Trunmed.c file of the code for the stats package of R, but it is rather indecipherable.

有谁知道线性时间滚动中值算法的编写良好的 C 实现?

Does anyone know of a well-written C implementation for the linear time rolling median algorithm?

Trunmed.c 代码链接http://google.com/codesearch/p?hl=en&sa=N&cd=1&ct=rc#mYw3h_Lb_e0/R-2.2.0/src/library/stats/src/Trunmed.c

Link to Trunmed.c code http://google.com/codesearch/p?hl=en&sa=N&cd=1&ct=rc#mYw3h_Lb_e0/R-2.2.0/src/library/stats/src/Trunmed.c

推荐答案

我已经看了几次 R 的 src/library/stats/src/Trunmed.c 因为我也想要类似的东西一个独立的 C++ 类/C 子程序.请注意,这实际上是两个实现合二为一,请参阅 src/library/stats/man/runmed.Rd(帮助文件的来源)

I have looked at R's src/library/stats/src/Trunmed.c a few times as I wanted something similar too in a standalone C++ class / C subroutine. Note that this are actually two implementations in one, see src/library/stats/man/runmed.Rd (the source of the help file) which says

details{
  Apart from the end values, the result code{y = runmed(x, k)} simply has
  code{y[j] = median(x[(j-k2):(j+k2)])} (k = 2*k2+1), computed very
  efficiently.

  The two algorithms are internally entirely different:
  describe{
    item{"Turlach"}{is the Härdle-Steiger
      algorithm (see Ref.) as implemented by Berwin Turlach.
      A tree algorithm is used, ensuring performance eqn{O(n log
        k)}{O(n * log(k))} where code{n <- length(x)} which is
      asymptotically optimal.}
    item{"Stuetzle"}{is the (older) Stuetzle-Friedman implementation
      which makes use of median emph{updating} when one observation
      enters and one leaves the smoothing window.  While this performs as
      eqn{O(n 	imes k)}{O(n * k)} which is slower asymptotically, it is
      considerably faster for small eqn{k} or eqn{n}.}
  }
}

很高兴看到以更独立的方式重新使用它.你是自愿的吗?我可以帮忙处理一些 R 位.

It would be nice to see this re-used in a more standalone fashion. Are you volunteering? I can help with some of the R bits.

编辑 1:除了上面指向旧版 Trunmed.c 的链接外,这里是

Edit 1: Besides the link to the older version of Trunmed.c above, here are current SVN copies of

编辑 2:Ryan Tibshirani 在 快速中值分箱,这可能是窗口方法的合适起点.

Edit 2: Ryan Tibshirani has some C and Fortran code on fast median binning which may be a suitable starting point for a windowed approach.

这篇关于C中的滚动中值算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆