Algorithm for calculating most stable, consecutive values from a database

Question

I have some questions and I'm in need of your input.

Say I have a database table filled with 2000-3000 rows, and each row has a value and some identifiers. I need to extract ~100 consecutive rows with the most stable values (lowest spread). It's okay to have a few jumper values (outliers) in there if you can exclude them. How would you do this and what algorithm would you use?

I'm currently using SAS Enterprise Guide for my DB which runs on Oracle. I don't really know that much of the generic SAS language but I don't know what other language I could use for this? Some scripting language? I have limited programming knowledge but this task seems pretty easy, correct?

The algorithms I've been thinking of are:

  1. Select 100 consecutive rows and calculate the standard deviation. Increment the select statement by 1 and calculate the standard deviation again. Loop through the whole table. Export the rows with the lowest standard deviation.

  2. Same as 1, but calculate variance instead of standard deviation (basically the same thing). When the whole table has been looped, do it again but exclude the 1 row whose value is furthest from the average. Repeat the process until 5 jumpers have been excluded and compare the results. Pros and cons compared to method 1?
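For readers weighing a scripting language instead of SAS, method 1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the values are taken to be already loaded into a list in row order, and the name `most_stable_window` is hypothetical, not part of any library.

```python
from statistics import stdev


def most_stable_window(values, width):
    """Return (start_index, stddev) of the consecutive window with the lowest spread.

    Hypothetical helper: `values` is assumed to be a list of numbers in row order.
    """
    if width < 2 or width > len(values):
        raise ValueError("width must be between 2 and len(values)")
    best_start, best_std = 0, float("inf")
    # Slide the window one row at a time, as described in method 1.
    for start in range(len(values) - width + 1):
        std = stdev(values[start:start + width])  # sample standard deviation
        if std < best_std:
            best_start, best_std = start, std
    return best_start, best_std


# Small made-up series; the quiet stretch starts at index 2.
values = [10.0, 12.0, 11.0, 11.1, 11.0, 10.9, 15.0, 9.0]
start, std = most_stable_window(values, 4)
print(start, values[start:start + 4])
```

With 2000-3000 rows and a width of 100, this brute-force loop evaluates fewer than 3000 windows, so there is little reason to optimize it. Because variance is the square of the standard deviation, ranking windows by either measure picks the same window.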

Questions:

  • What is the best and simplest method?
  • Preferred language? Is it possible in SAS?
  • Do you have any other method you would suggest?

Thanks in advance

/Niklas

Recommended answer

The below code will do what you are asking. It is just using some sample data and only calcs it for 10 observations (rather than 100). I'll leave it to you to adapt as required.

Create some sample data, using a dataset that is available in all SAS installations:

data xx;
  set sashelp.stocks;
  where stock = 'IBM';
  obs = _n_;
run;

Create row numbers and sort it descending. Makes it easier to calc stddev:

proc sort data=xx;
  by descending obs;
run;

Use an array to keep the subsequent 10 obs for every row. Calculate the stddev for each row using the array (except for the last 10 rows). Remember we are working backwards through the data.

data calcs;
  set xx;

  array a[10] arr1-arr10;

  retain arr1-arr10 .;

  do tmp=10 to 2 by -1;
    a[tmp] = a[tmp-1];
  end;
  a[1] = close;

  if _n_ ge 10 then do;
    std = std(of arr1-arr10);
  end;

run;

Find which obs (ie. row) had the lowest stddev calc. Save it to a macro var.

proc sql noprint;
  select obs into :start_row
  from calcs
  having std = min(std)
  ;
quit;

Select the 10 observations from the sample data that were involved in calcing the lowest stddev.

proc sql noprint;
  create table final as
  select *
  from xx
  where obs between &start_row and %eval(&start_row+9)  /* BETWEEN is inclusive, so +9 gives exactly 10 rows */
  order by obs
  ;
quit;
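The SAS steps above do not handle the jumper values raised in the question. One possible way to approach that part (method 2 in the question) is sketched below in Python rather than SAS. It is a sketch under assumptions, not a definitive implementation: the function names are hypothetical, and it always drops exactly `max_jumpers` values from each window, the one furthest from the window mean each time, whether or not they are true outliers.

```python
from statistics import mean, stdev


def trimmed_std(window, max_jumpers):
    """Std dev of a window after dropping up to max_jumpers values.

    Assumption: a "jumper" is the value furthest from the current mean;
    values are dropped one at a time and the mean is recomputed each time.
    """
    kept = list(window)
    for _ in range(max_jumpers):
        if len(kept) <= 2:  # stdev needs at least two values
            break
        centre = mean(kept)
        kept.remove(max(kept, key=lambda v: abs(v - centre)))
    return stdev(kept)


def most_stable_window_trimmed(values, width, max_jumpers):
    """Return the start index of the window with the lowest trimmed std dev."""
    starts = range(len(values) - width + 1)
    return min(starts, key=lambda s: trimmed_std(values[s:s + width], max_jumpers))


# Made-up series: index 2 is a single spike inside an otherwise quiet stretch.
values = [10.0, 10.1, 50.0, 10.0, 9.9, 12.0, 14.0, 9.0, 13.0]
print(most_stable_window_trimmed(values, 5, max_jumpers=0))  # no trimming
print(most_stable_window_trimmed(values, 5, max_jumpers=1))  # drop one jumper per window
```

Without trimming, the spike pushes the chosen window away from the quiet stretch; allowing one jumper per window brings it back. The trade-off compared with method 1 is that every window loses its most extreme values, so the resulting spreads are only comparable between runs that use the same `max_jumpers`.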
