How to forward-fill empty values in a table


Problem description

I have a BigQuery table that looks like this: ![Table](https://ibb.co/1ZXMH71). As you can see, most values are empty. I'd like to forward-fill those empty values, meaning each empty value is replaced with the last known value when the rows are ordered by time.

Apparently, there is a function for that called FILL (https://cloud.google.com/dataprep/docs/html/FILL-Function_57344752), but I have no idea how to use it.

This is the query I tried running in the web UI:

SELECT sns_6,Time
FROM TABLE_PATH
FILL sns_6,-1,0 order: Time

The error I get is: Syntax error: Unexpected identifier "sns_6" at [3:6]. What I want is a new table in which the column sns_6 is filled with the last known value.

As a bonus: I'd like this to happen for all columns, but because FILL only supports a single column, for now I'll have to iterate over all the columns. If anyone has an idea of how to do the iteration, that would be a great bonus.

Solution

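Below is for BigQuery Standard SQL. The linked FILL function belongs to Cloud Dataprep rather than BigQuery SQL, which would explain the syntax error.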

> I'd like to forward-fill those empty values, meaning using the last known value ordered by time

#standardSQL
-- Carry the last non-NULL value forward, with rows ordered by time
SELECT time,
  LAST_VALUE(sns_1 IGNORE NULLS) OVER(ORDER BY time) sns_1,
  LAST_VALUE(sns_2 IGNORE NULLS) OVER(ORDER BY time) sns_2
FROM `project.dataset.table`
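
To make the window semantics concrete, here is a small Python sketch (not BigQuery code) that mimics what `LAST_VALUE(x IGNORE NULLS) OVER(ORDER BY time)` computes on an in-memory list of rows. The function name `forward_fill` and the sample values are made up for illustration, and the sketch assumes distinct `time` values. Note that rows before the first non-null value stay empty, because there is no earlier value to carry forward.

```python
def forward_fill(rows):
    """Mimic LAST_VALUE(value IGNORE NULLS) OVER (ORDER BY time).

    rows: list of (time, value) pairs, where value may be None (NULL).
    Returns a list of (time, filled_value) pairs sorted by time.
    """
    filled = []
    last_known = None  # stays None until the first non-null value is seen
    for time, value in sorted(rows, key=lambda r: r[0]):
        if value is not None:
            last_known = value
        filled.append((time, last_known))
    return filled


# Hypothetical sample data: (time, sns_1)
sample = [(3, None), (1, None), (2, 10), (4, 12), (5, None)]
print(forward_fill(sample))
# prints [(1, None), (2, 10), (3, 10), (4, 12), (5, 12)]
```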

> I'd like this to happen for all columns

You can add as many of the lines below as you have columns to fill (obviously, you need to replace sns_N with the real column's name):

  LAST_VALUE(sns_N IGNORE NULLS) OVER(ORDER BY time) sns_N
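
For the "iterate over all columns" part, one option is to generate the query text in a host language instead of typing each line by hand. The following is a minimal Python sketch under that assumption; `build_forward_fill_query` is a hypothetical helper, not part of any BigQuery client library, and the table path and column names are placeholders. It only builds the SQL string; running it against BigQuery is left out.

```python
def build_forward_fill_query(table, time_column, columns):
    """Return a BigQuery Standard SQL string that forward-fills `columns`.

    Identifiers are wrapped in backticks but not otherwise escaped, so this
    sketch assumes the names come from a trusted source.
    """
    if not columns:
        raise ValueError("at least one column to fill is required")
    select_items = [f"`{time_column}`"]
    for col in columns:
        # One window expression per column, same pattern as the answer above
        select_items.append(
            f"LAST_VALUE(`{col}` IGNORE NULLS) "
            f"OVER (ORDER BY `{time_column}`) AS `{col}`"
        )
    body = ",\n  ".join(select_items)
    return f"SELECT\n  {body}\nFROM `{table}`"


# Placeholder names; replace with the real table path and column names.
query = build_forward_fill_query(
    "project.dataset.table", "time", [f"sns_{i}" for i in range(1, 7)]
)
print(query)
```

The generated text can then be pasted into the web UI or passed to whatever client is used to submit queries.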
