Grouping records and getting the standard deviation of intervals for grouped records in BigQuery, getting wrong value


Question

The SQL below gets the average interval between values of the timestamp column, grouped by icao_address, flight_number and flight_date. I'm trying to do the same for the standard deviation, and although I get a figure, it is wrong. The standard deviation I get back is 14.06 (see the image in the original post), while it should be around 1.8.

Below is what I'm using for the stddev calculation:

STDDEV_POP(UNIX_SECONDS(timestamp))as standard_deviation

Below is my SQL:

#standardSQL
SELECT
  DATE(timestamp) AS flight_date,
  SAFE_DIVIDE(TIMESTAMP_DIFF(MAX(timestamp), MIN(timestamp), SECOND), COUNT(DISTINCT timestamp) - 1) AS avg_interval_message,
  STDDEV_POP(UNIX_SECONDS(timestamp)) AS standard_deviation,
  icao_address,
  flight_number,
  MIN(timestamp) AS firstrecord,
  MAX(timestamp) AS lastrecord,
  COUNT(timestamp) AS target_updates
FROM `ais-data-analysis._analytics._aoi_table`
GROUP BY icao_address, flight_number, flight_date
HAVING avg_interval_message IS NOT NULL AND flight_number IS NOT NULL AND icao_address = '4B8E41'
ORDER BY flight_date, avg_interval_message ASC


What I want is the standard deviation of the intervals between consecutive values of the timestamp column. The group in question has 10 records.

Recommended answer

You can use STDDEV_POP(<FLOAT>) to calculate the standard deviation, as described in the BigQuery documentation quoted below.

Description

Returns the population (biased) standard deviation of the values. The return result is between 0 and +Inf.

This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.

If this function receives a single non-NULL input, it returns 0.
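The three rules above can be checked with a small query over inline values. This is only a sketch: the literals and column aliases are made up for illustration and no table is assumed.

```sql
#standardSQL
-- Illustrative inline data only; the aliases are hypothetical.
SELECT
  -- The NULL is ignored, leaving 10 and 14: mean 12, population stddev 2.0
  (SELECT STDDEV_POP(x) FROM UNNEST([10.0, NULL, 14.0]) AS x) AS nulls_ignored,
  -- A single non-NULL input gives 0.0
  (SELECT STDDEV_POP(x) FROM UNNEST([10.0]) AS x) AS single_input,
  -- Every input is NULL, so the result is NULL
  (SELECT STDDEV_POP(x) FROM UNNEST([CAST(NULL AS FLOAT64)]) AS x) AS all_null;
```

The NULL-ignoring rule matters for interval calculations, because the first record of a group has no previous record and therefore no interval.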

Supported Input Types

FLOAT64

Optional Clauses

The clauses are applied in the following order:

OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_POP().

DISTINCT: Each distinct value of expression is aggregated only once into the result.
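To illustrate what DISTINCT changes, the sketch below runs STDDEV_POP with and without it on the same made-up values. The literals and aliases are assumptions chosen for the example.

```sql
#standardSQL
-- Illustrative inline data only; the aliases are hypothetical.
SELECT
  -- All three values are used: variance 8/9, stddev about 0.943
  (SELECT STDDEV_POP(x) FROM UNNEST([1.0, 1.0, 3.0]) AS x) AS all_values,
  -- The duplicate 1.0 counts once, leaving 1 and 3: stddev 1.0
  (SELECT STDDEV_POP(DISTINCT x) FROM UNNEST([1.0, 1.0, 3.0]) AS x) AS distinct_values;
```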

Returned Data Types

FLOAT64
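Applied to the query in the question, the likely cause of the unexpected figure is that STDDEV_POP(UNIX_SECONDS(timestamp)) measures the spread of the timestamps themselves, not of the gaps between them. A possible approach, sketched below, is to compute each gap first with the LAG window function and then aggregate. The table and column names come from the question; the CTE name intervals and the column interval_seconds are introduced here for illustration, and the query has not been run against the asker's data.

```sql
#standardSQL
WITH intervals AS (
  SELECT
    icao_address,
    flight_number,
    DATE(timestamp) AS flight_date,
    -- Seconds since the previous message of the same aircraft, flight and day.
    -- The first message of each group has no predecessor, so its gap is NULL,
    -- which STDDEV_POP and AVG both ignore.
    TIMESTAMP_DIFF(
      timestamp,
      LAG(timestamp) OVER (
        PARTITION BY icao_address, flight_number, DATE(timestamp)
        ORDER BY timestamp),
      SECOND) AS interval_seconds
  FROM `ais-data-analysis._analytics._aoi_table`
  WHERE icao_address = '4B8E41'
    AND flight_number IS NOT NULL
)
SELECT
  flight_date,
  icao_address,
  flight_number,
  AVG(interval_seconds) AS avg_interval_message,
  STDDEV_POP(interval_seconds) AS standard_deviation
FROM intervals
GROUP BY icao_address, flight_number, flight_date
ORDER BY flight_date, avg_interval_message;
```

One difference from the original query: it divides by COUNT(DISTINCT timestamp) - 1, so repeated timestamps do not affect its average, whereas here they produce zero-second gaps. If duplicates are possible, deduplicating the timestamps before applying LAG would keep the two averages consistent.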

I hope this helps.
