在HiveQL中创建和求和变量的值 [英] Creating and sum the values of variables in HiveQL

查看:228
本文介绍了在HiveQL中创建和求和变量的值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我的一个用例中,我试图将传统的基于SQL的仓库应用程序迁移到Hive中。我既不是SQL专家,也不是HiveQL专家。但是,我在创建表格和将csv导入配置表格中取得了不错的成绩。但随后我在Hive QL中复制基于SQL的分析查询时遇到了障碍。在其中一种分析算法中,我们需要创建多个变量并计算这些变量的总和,如下所示:

  var x = query1 
var y = query2
var z = query3

var count = x + y + z

我很努力在HiveQL中完成这项工作。

实际上,所有这些变量形成一列,而另外一列是总数,它携带了var1,var2和var3的总和。有没有像sum这样的聚合函数可以用来在特定行的列中添加值,即var1,var2和var3,并将它们存储在该行的列总数中?



有可能做下面的事情。在表TEMP本身创建期间不可能告诉Hive?

 插入表TEMP值(1,2,1 2); 

如果上述不可行,我尝试了下面的解决方法

  DROP TABLE T_TEMP; 
DROP TABLE VARI;

CREATE TABLE VARI(value INT);
插入表VARI值(1);
插入表格VARI值(2);
CREATE TABLE T_TEMP(var1 INT,var2 INT,total INT);
INSERT INTO TABLE T_TEMP
SELECT
1,
2,
sum(t.value)
from VARI as t;

我使用的是Apache Hue,并且倾向于在Hue编辑器中编写Hue编辑器中的所有查询和语句,并最终启动查询。

是否有其他可用的干净方法,除了:a)写入temp列,b)直接在INSERT中添加值。我更喜欢变量x,y,z并且能够将它们相加count = x + y + z。

解决方案

  select sum(yx)from(
select count(*)作为x从table1
union all
从table2中选择count(*)作为x
union all
从table3中选择count(*)作为x)y


In one of my use cases, I am trying to migrate traditional SQL based warehousing application into Hive. Neither I am an expert in SQL not in HiveQL. However, I have had decent success in creating tables and importing csv into hive tables. But then I hit a road block while replicating SQL based analytics query in Hive QL. In one of the analytics algorithms, we need to create multiple variables and calculate the sum of those variables, like below

var x = query1
var y = query2
var z = query3

var count = x + y + z

I am struggling to get this done in HiveQL.

In fact, all these variables form a column, and one more column is total that carries the sum of var1, var2, and var3. Is there any aggregate function something like sum that i can use to add values in the column of a particular row namely var1, var2, and var3, and store them in column total for that row?

Is it possible to do something like below. Is it not possible to tell Hive during the creation of table TEMP itself?

INSERT INTO TABLE TEMP VALUES (1, 2, 1+2);

If the above is not possible, I tried the below workaround

DROP TABLE T_TEMP;
DROP TABLE VARI;

CREATE TABLE VARI(value INT);
INSERT INTO TABLE VARI VALUES(1);
INSERT INTO TABLE VARI VALUES(2);
CREATE TABLE T_TEMP(var1 INT, var2 INT, total INT);      
INSERT INTO TABLE T_TEMP  
   SELECT 
     1,
     2,
     sum(t.value)
     from VARI as t;

I am using Apache Hue and prefer to write all the queries and statements in Hive editor that Hue proviedes, and fire the query in the end.

Is there any other clean approaches available, other than, a)writing into temp column, b) adding the values in INSERT directly. I prefer to have variables x, y, z and be able to sum them count = x + y + z.

解决方案

select sum(y.x) from (
select count(*) as x from table1
union all
select count(*) as x from table2
union all
select count(*) as x from table3) y

这篇关于在HiveQL中创建和求和变量的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆