How to make a sliding window model for data stream mining?
Problem description
We have a stream of data (sensor readings, or click-stream data on a server) that has to be processed with a sliding-window algorithm: the last N samples (say, 500) must be kept in memory. These samples are then used to build histograms, compute aggregations, and capture information about anomalies in the input data stream.
How can I implement such a sliding window?
If you are asking how to store and maintain these values in a sliding-window manner, consider this simple example, which keeps track of the running mean of the last 10 values of a random data stream:
    WINDOW_SIZE = 10;
    x = nan(WINDOW_SIZE,1);

    %# init
    counter = 0;
    stats = [NaN NaN];                    %# previous/current value

    %# prepare figure
    SHOW_LIM = 200;
    hAx = axes('XLim',[1 SHOW_LIM], 'YLim',[200 800]);
    hLine = line('XData',1, 'YData',nan, 'EraseMode','none', ...
        'Parent',hAx, 'Color','b', 'LineWidth',2);

    %# infinite loop!
    while true
        val = randi([1 1000]);            %# get new value from data stream
        x = [ x(2:end) ; val ];           %# add to window in a cyclic manner
        counter = counter + 1;

        %# do something interesting with x
        stats(1) = stats(2);              %# keep track of the previous mean
        stats(2) = nanmean(x);            %# update the current mean

        %# show and update plot
        set(hLine, 'XData',[counter-1 counter], 'YData',[stats(1) stats(2)])
        if rem(counter,SHOW_LIM)==0
            %# show only the last couple of means
            set(hAx, 'XLim', [counter counter+SHOW_LIM]);
        end
        drawnow
        pause(0.02)
        if ~ishandle(hAx), break, end     %# break in case you close the figure
    end
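For the 500-sample case from the question, shifting the whole array on every new value is more work than necessary. One possible alternative is a circular buffer that overwrites the oldest slot. The following is only a sketch under stated assumptions: the random source, the finite loop length, the 30-sample warm-up, the 3-sigma anomaly rule, and the bin edges are all illustrative choices, not part of the original answer.

```matlab
WINDOW_SIZE = 500;                 % window length taken from the question
buf   = zeros(WINDOW_SIZE, 1);     % preallocated storage for the window
head  = 0;                         % index of the most recently written slot
nSeen = 0;                         % total number of samples received so far
nAnomalies = 0;                    % how many samples were flagged

rng(0);                            % fixed seed so the sketch is reproducible
for k = 1:2000                     % stand-in for the real stream (assumption)
    val = randi([1 1000]);         % hypothetical data source

    % Slots 1..min(nSeen,WINDOW_SIZE) are the ones filled so far.
    valid = buf(1:min(nSeen, WINDOW_SIZE));

    % Compare the new sample against the current window before inserting it.
    if numel(valid) >= 30          % arbitrary warm-up threshold (assumption)
        mu    = mean(valid);
        sigma = std(valid);
        if sigma > 0 && abs(val - mu) > 3*sigma   % 3-sigma rule (assumption)
            nAnomalies = nAnomalies + 1;
        end
    end

    % Overwrite the oldest slot instead of shifting the whole array.
    head = mod(head, WINDOW_SIZE) + 1;
    buf(head) = val;
    nSeen = nSeen + 1;
end

% Aggregates and a histogram over the current window contents.
valid      = buf(1:min(nSeen, WINDOW_SIZE));
windowMean = mean(valid);
counts     = histcounts(valid, 0:100:1000);   % ten equal-width bins
```

Once the buffer is full, the samples are stored in rotated order. If chronological order matters, `buf([head+1:end, 1:head])` returns them oldest first.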
Update
The EraseMode property (set to 'none' in the code above) was deprecated and has been removed in recent versions of MATLAB. Use the animatedline function instead for similar functionality.
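As a rough illustration of that update, the plotting loop might be rewritten along these lines. This is a sketch, not the original author's code: the finite `N_STEPS` run length, the use of `mean(x,'omitnan')` in place of `nanmean`, and the `MaximumNumPoints` setting are assumptions made here to keep the example self-contained.

```matlab
WINDOW_SIZE = 10;
x = nan(WINDOW_SIZE, 1);           % window contents; NaN until filled
SHOW_LIM = 200;
N_STEPS  = 1000;                   % finite run length for the sketch (assumption)

% Prepare the figure; animatedline keeps the points it has been given,
% so no EraseMode setting is needed.
hAx   = axes('XLim', [1 SHOW_LIM], 'YLim', [200 800]);
hLine = animatedline(hAx, 'Color', 'b', 'LineWidth', 2, ...
    'MaximumNumPoints', SHOW_LIM); % cap the number of stored points

for counter = 1:N_STEPS
    val = randi([1 1000]);         % get new value from the data stream
    x = [x(2:end); val];           % drop the oldest value, append the newest

    m = mean(x, 'omitnan');        % running mean, ignoring unfilled slots
    addpoints(hLine, counter, m);  % append the new mean to the line

    if rem(counter, SHOW_LIM) == 0
        % Move the visible range forward, as in the original loop.
        set(hAx, 'XLim', [counter counter+SHOW_LIM]);
    end
    drawnow limitrate
    if ~isvalid(hLine), break, end % stop if the figure was closed
end
```

With `MaximumNumPoints` set, the line itself holds only the most recent `SHOW_LIM` means, so memory use stays bounded even if the loop is made infinite again.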