如何在 hive 中创建一个空的结构数组? [英] How to create an empty array of struct in hive?

查看:269
本文介绍了如何在 hive 中创建一个空的结构数组?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在Hive 1.1.0中有一个视图,根据条件,它应该返回一个空数组或struct的数组

I have a view in Hive 1.1.0, based on a condition, it should return an empty array or an array of struct<name: string, jobslots: int>

这是我的代码:

select
      case when <condition> 
             then array()
           else array(struct(t1.name, t1.jobslots))
       end
from table t1;

这里的问题是,空数组 array() 的类型是 array.因此,当我尝试将其插入表中时,它会引发错误.

The problem here is, that the empty array array() is of type array<string>. So when I try to insert it into a table, it throws an error.

如何更改它以返回 array<struct<name: string, jobslots:int>> 类型的空数组,以便 Hive 的 size() 函数在这个数组上返回 0?

How can I change this to return an empty array of type array<struct<name: string, jobslots:int>> so that Hive's size() function returns 0 on this array?

推荐答案

您可以使用 collect_listcolect_set 来收集从连接中获取的结构数组,并且连接条件为假,则 collect_list 将产生一个空的结构数组.

You can do it using collect_list or colect_set for collecting an array of structs taken from a join, and a join condition is false, then collect_list will produce an empty array of struct.

此查询返回大小为 0 的数组:

This query returns array of 0 size:

select a.id, size(collect_list(b.str))=0 array_size_zero
from
(select 2 id ) a
  left join (select 1 as id, named_struct('name',null,'value',null) as str) b
            on a.id=b.id
group by a.id

结果:

a.id    array_size_zero
2       true

如果在第一个子查询 a 中更改 id 以与 b 连接,它将返回具有 1 个元素的数组.而且这些结果是同类型的,你可以使用union all轻松查看.

If you change id in the first subquery a to join with b, it will return array with 1 element. And these results are of the same type, you can check it easily using union all.

检查结果类型相同:

select a.id, collect_list(b.str) my_array
from
(select 1 id ) a
  left join (select 1 as id, named_struct('name',null,'value',null) as str) b
            on a.id=b.id
group by a.id

union all

select a.id, collect_list(b.str) my_array
from
(select 2 id ) a
  left join (select 1 as id, named_struct('name',null,'value',null) as str) b
            on a.id=b.id
group by a.id  

结果:

id  my_array
1   [{"name":null,"value":null}]
2   []

如果我尝试联合所有不同类型的空结构数组,例如 array() 会发生什么:

What happens if I try to UNION ALL empty array of struct with different type, for example array():

select 1 id, array() my_array

union all

select a.id, collect_list(b.str) my_array
from
(select 2 id ) a
  left join (select 1 as id, named_struct('name',null,'value',null) as str) b
            on a.id=b.id
group by a.id

异常:

编译语句时出错:FAILED: SemanticException Schema of联合的双方应该匹配:列 my_array 是类型第一个表上的数组和类型数组在第二张桌子上.不能告诉空 AST 的位置.

Error while compiling statement: FAILED: SemanticException Schema of both sides of union should match: Column my_array is of type array on first table and type array<structname:void,value:void> on second table. Cannot tell the position of null AST.

这表明第一个查询确实返回空的结构数组.您可以轻松地在查询中执行类似的联接.

This demonstrates that first query really returns empty array of struct. You can easily do similar join in your query.

如何在带条件的查询中使用它?演示:

How can you use it in your query with condition? Demo:

select a.id, case when true --Put your condition here instead of dummy <true> 
                     then collect_list(a.str) --not empty
                  else collect_list(b.str) --this one is empty array of the same type
              end as my_array
from
(select 2 id, named_struct('name',null,'value',null) str) a
  left join (select 1 as id, named_struct('name',null,'value',null) as str) b
            on FALSE
group by a.id

CASE 表达式很愉快,不会引发不兼容类型的异常

CASE expression is quite happy and does not raise exception about incompatible types

这篇关于如何在 hive 中创建一个空的结构数组?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆