How to find anomalous usage


Problem description


My app contains utility meter usage. One of the things we have to deal with
is when a usage is clearly incorrect. Perhaps someone wrote the meter
reading down incorrectly or made a factor of 10 error when entering the
reading, etc. At other times the usage is zero or somehow was entered as a
negative number.

So I'm thinking about adding functionality to search for such anomalies. For
instance, show months where the meter reading is 25% higher than the average
for the prior 12 months. Or show months for a particular meter where there
is a difference of 20% between adjacent monthly usage. Here's a data example:

Meter 5678

Jan-06 100
Feb-06 105
Mar-06 75
Apr-06 90
May-06 101
Jun-06 900
Jul-06 89

So you can see from this data that 900 is clearly incorrect and probably
should be 90. The 75 usage in Mar-06 would show up on a search where there
is a difference between adjacent months of 25% or more. We'll probably also
code the functionality to search for zero usage and negative usage.
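
The two simplest checks described above (a jump between adjacent months, and zero or negative usage) can be sketched in a few lines. This is only an illustration in Python using the sample readings from the post. The function names, and the choice to measure the change relative to the previous month, are assumptions; the post does not say which month the percentage is based on.

```python
# Sample usage for meter 5678, copied from the post.
USAGE = [("Jan-06", 100), ("Feb-06", 105), ("Mar-06", 75), ("Apr-06", 90),
         ("May-06", 101), ("Jun-06", 900), ("Jul-06", 89)]


def adjacent_jumps(rows, threshold=0.25):
    """Return (month, usage, relative_change) for every month whose usage
    differs from the previous month by at least `threshold`.

    The change is measured against the previous month (an assumption).
    """
    flagged = []
    for (_, prev), (month, cur) in zip(rows, rows[1:]):
        if prev <= 0:
            continue  # no meaningful ratio; caught by non_positive() instead
        change = (cur - prev) / prev
        if abs(change) >= threshold:
            flagged.append((month, cur, round(change, 3)))
    return flagged


def non_positive(rows):
    """Return the (month, usage) pairs with zero or negative usage."""
    return [(month, usage) for month, usage in rows if usage <= 0]


if __name__ == "__main__":
    for row in adjacent_jumps(USAGE):
        print(row)
```

On the sample data this flags Mar-06, Jun-06 and Jul-06. Jul-06 is flagged only because the month before it is the bad reading, so a flagged month may really point at its neighbour.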

Bear in mind that we have several thousand meters and around 100,000
monthly meter usages spanning several years.

I'm looking for an approach to implement this functionality. Searching row
by row through the tables would probably take a very long time. Is there a
clever way to handle this through SQL alone or mostly through SQL? Or does
anyone have any other suggestions? It would seem that this could be a very
slow process.
Thanks.
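
One set-based way to express the "25% higher than the average for the prior 12 months" rule is a self-join, so the database does the work in a single query rather than the application walking rows. The sketch below runs the SQL through Python's built-in sqlite3 module purely so it can be tried out. The table name MeterUsage, its columns, and the first-of-month text dates are all hypothetical, and the date arithmetic uses SQLite's date() function, which would need translating to Access's own date functions.

```python
import sqlite3

# Sample readings for meter 5678 from the post; Period is the first day of the month.
ROWS = [(5678, "2006-01-01", 100), (5678, "2006-02-01", 105),
        (5678, "2006-03-01", 75), (5678, "2006-04-01", 90),
        (5678, "2006-05-01", 101), (5678, "2006-06-01", 900),
        (5678, "2006-07-01", 89)]

# For each reading c, average the same meter's readings p from the 12 months
# before it, and keep c only if it exceeds that average by the given factor.
ABOVE_PRIOR_AVG = """
SELECT c.MeterID, c.Period, c.Usage, AVG(p.Usage) AS PriorAvg
FROM MeterUsage AS c
JOIN MeterUsage AS p
  ON p.MeterID = c.MeterID
 AND p.Period >= date(c.Period, '-12 months')
 AND p.Period < c.Period
GROUP BY c.MeterID, c.Period, c.Usage
HAVING c.Usage > ? * AVG(p.Usage)
ORDER BY c.MeterID, c.Period
"""

NON_POSITIVE = """
SELECT MeterID, Period, Usage FROM MeterUsage
WHERE Usage <= 0
ORDER BY MeterID, Period
"""


def make_db(rows):
    """Build an in-memory database holding the hypothetical MeterUsage table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MeterUsage (MeterID INTEGER, Period TEXT, Usage REAL)")
    # The join looks rows up by meter and period, so index those columns.
    conn.execute("CREATE INDEX idx_meter_period ON MeterUsage (MeterID, Period)")
    conn.executemany("INSERT INTO MeterUsage VALUES (?, ?, ?)", rows)
    return conn


def above_prior_average(conn, factor=1.25):
    """Readings more than `factor` times the average of the prior 12 months."""
    return conn.execute(ABOVE_PRIOR_AVG, (factor,)).fetchall()


def non_positive_usage(conn):
    """Readings with zero or negative usage."""
    return conn.execute(NON_POSITIVE).fetchall()


if __name__ == "__main__":
    print(above_prior_average(make_db(ROWS)))
```

With only seven months of sample data the "prior 12 months" window simply uses whatever earlier months exist; on that data the query returns the Jun-06 reading alone.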

--
Message posted via AccessMonster.com
http://www.accessmonster.com/Uwe/For...ccess/200610/1

Recommended answer


On Wed, 04 Oct 2006 14:23:13 GMT, "rdemyan via AccessMonster.com"
<u6836@uwe> wrote:

I would compare the readings against a scaled version of the common
trend. The trend would be an average over all meters, showing for
example that the usage in winter months is higher than in summer
months. The scaling is to account for a larger home putting up higher
numbers than a smaller one.
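
A rough sketch of this idea in Python follows. Meter 5678 is the sample from the question; meters A to D are invented purely for illustration. Two choices here are assumptions rather than part of the suggestion: the trend and the per-meter scale use the median instead of the average (with only five meters a single bad reading would otherwise dominate the trend), and a reading is flagged when it is more than 50% away from the scaled trend.

```python
from statistics import median

MONTHS = ["Jan-06", "Feb-06", "Mar-06", "Apr-06", "May-06", "Jun-06", "Jul-06"]

USAGE = {
    "5678": [100, 105, 75, 90, 101, 900, 89],      # sample from the question
    "A": [200, 210, 180, 180, 200, 180, 180],      # hypothetical larger home
    "B": [50, 52.5, 45, 45, 50, 45, 45],           # hypothetical smaller home
    "C": [150, 157.5, 135, 135, 150, 135, 135],    # hypothetical
    "D": [80, 84, 72, 72, 80, 72, 72],             # hypothetical
}


def trend_outliers(usage, months, tolerance=0.5):
    """Flag readings that stray from the common trend scaled to each meter.

    Returns (meter, month, actual, expected) tuples.
    """
    # Common trend: one typical value per month across all meters.
    trend = [median(series[i] for series in usage.values())
             for i in range(len(months))]
    flagged = []
    for meter, series in usage.items():
        # Scale: how large this meter usually runs relative to the trend.
        scale = median(u / t for u, t in zip(series, trend) if t > 0)
        for month, actual, t in zip(months, series, trend):
            expected = scale * t
            if abs(actual - expected) > tolerance * expected:
                flagged.append((meter, month, actual, round(expected, 1)))
    return flagged


if __name__ == "__main__":
    for row in trend_outliers(USAGE, MONTHS):
        print(row)
```

On this data only the Jun-06 reading of meter 5678 is flagged. With several thousand real meters a plain average for the trend, as described above, may be perfectly adequate.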

I would not worry about speed until it's proven to be an issue.

-Tom.

[quoted text clipped]

"rdemyan via AccessMonster.com" <u6836@uwe> wrote in
news:6743ad47a3f7b@uwe:

[quoted text clipped]

OTTOMH

SELECT m.Reading, (m.Reading-sq.Average)/sq.StDev AS ZScore FROM Meter m
LEFT JOIN
[SELECT Avg(Meter.Reading) AS Average, StDev(Meter.Reading) AS StDev
FROM Meter]. sq
ON m.Reading*1000 <> sq.Average
WHERE ((m.Reading-sq.Average)/sq.StDev)>=2
ORDER BY (m.Reading-sq.Average)/sq.StDev

You, of course, would have to modify this for your own situation. I have
suggested that a Score >= 2 would be suspect but your own experience
would be the best guide here.

No, I don't really expect that you will be able to use this, but hope
springs eternal.

--
Lyle Fairfield
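
To make the z-score idea concrete, here is a small Python sketch applied to the sample readings from the question. It treats one meter at a time, which is an assumed adaptation (the query above averages over the whole Meter table), and the function names are made up for the example. statistics.stdev is the sample standard deviation, which is also what Access's StDev computes.

```python
from statistics import mean, stdev

# Sample usage for meter 5678, copied from the question.
READINGS = [("Jan-06", 100), ("Feb-06", 105), ("Mar-06", 75), ("Apr-06", 90),
            ("May-06", 101), ("Jun-06", 900), ("Jul-06", 89)]


def z_scores(rows):
    """Return (month, usage, z) where z = (usage - average) / standard deviation."""
    values = [usage for _, usage in rows]
    if len(values) < 2:
        return [(month, usage, 0.0) for month, usage in rows]
    avg, sd = mean(values), stdev(values)
    if sd == 0:
        # Every reading is identical, so nothing stands out.
        return [(month, usage, 0.0) for month, usage in rows]
    return [(month, usage, (usage - avg) / sd) for month, usage in rows]


def suspects(rows, threshold=2.0):
    """Readings whose z-score is at least `threshold` (high side only, as in the query)."""
    return [(month, usage, round(z, 2))
            for month, usage, z in z_scores(rows) if z >= threshold]


if __name__ == "__main__":
    print(suspects(READINGS))
```

On the sample data only the 900 reading passes the threshold of 2, with a z-score of about 2.27. Note that the bad reading inflates the standard deviation it is judged against: with n readings no z-score can exceed (n - 1) / sqrt(n), which is about 2.27 for seven readings, so short histories may need a lower threshold.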


Interesting, Lyle. I'll see what I can do with this and report back. You
show 2 but that can be easily changed by the user on the form (however, I'll
have to think about what that really means in terms that we mere mortals can
understand).

One definite thing I will want to add is the ability to select a specific
time frame.
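
A sketch of how a user-supplied threshold and time frame might be passed into such a query as parameters. It uses Python's sqlite3 module only so that it can be run as-is. The table MeterUsage and its columns are hypothetical, the statistics are computed per meter, the test is two-sided (it also catches unusually low readings, unlike the one-sided query above), and because SQLite has no StDev function one is registered here as a custom aggregate.

```python
import sqlite3
from statistics import stdev

# Sample readings for meter 5678 from the thread; Period is the first of the month.
ROWS = [(5678, "2006-01-01", 100), (5678, "2006-02-01", 105),
        (5678, "2006-03-01", 75), (5678, "2006-04-01", 90),
        (5678, "2006-05-01", 101), (5678, "2006-06-01", 900),
        (5678, "2006-07-01", 89)]


class StDev:
    """Sample standard deviation as an SQLite aggregate (stand-in for Access StDev)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        return stdev(self.values) if len(self.values) > 1 else None


QUERY = """
SELECT m.MeterID, m.Period, m.Usage,
       (m.Usage - sq.Average) / sq.SD AS ZScore
FROM MeterUsage AS m
JOIN (SELECT MeterID, AVG(Usage) AS Average, StDev(Usage) AS SD
      FROM MeterUsage
      WHERE Period BETWEEN :start AND :end
      GROUP BY MeterID) AS sq
  ON sq.MeterID = m.MeterID
WHERE m.Period BETWEEN :start AND :end
  AND sq.SD > 0
  AND ABS((m.Usage - sq.Average) / sq.SD) >= :threshold
ORDER BY ZScore DESC
"""


def make_conn(rows):
    conn = sqlite3.connect(":memory:")
    conn.create_aggregate("StDev", 1, StDev)
    conn.execute("CREATE TABLE MeterUsage (MeterID INTEGER, Period TEXT, Usage REAL)")
    conn.executemany("INSERT INTO MeterUsage VALUES (?, ?, ?)", rows)
    return conn


def find_suspects(conn, start, end, threshold):
    """Readings between start and end (inclusive) whose |z-score| >= threshold."""
    params = {"start": start, "end": end, "threshold": threshold}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    conn = make_conn(ROWS)
    print(find_suspects(conn, "2006-01-01", "2006-07-01", 2.0))
```

Because the average and standard deviation are recomputed inside the chosen time frame, the same reading can be flagged in one window and not in another. For example, over Jan-06 to May-06 with a threshold of 1.5 the query returns the Mar-06 reading of 75, which is not flagged when the 900 reading is in the window.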

Lyle Fairfield wrote:

>My app contains utility meter usage. One of the things we have to
deal with is when a usage is clearly incorrect. Perhaps someone wrote


[quoted text clipped - 34 lines]

>>
Thanks.


[quoted text clipped]



--
Message posted via AccessMonster.com
http://www.accessmonster.com/Uwe/For...ccess/200610/1

