Get the sysdate -1 in Hive


Question

Is there any way to get the current date minus one day in Hive, that is, always yesterday's date, in the format 20120805?

Since today is Aug 6th, I can run my query like this to get the data for yesterday's date:

select * from table1 where dt = '20120805';

But the table below is partitioned on the date column (dt), and when I tried to get yesterday's date with the date_sub function like this:

select * from table1 where dt = date_sub(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(),
'yyyyMMdd')) , 1)     limit 10;

it looks for the data in all the partitions. Why? Is something wrong with my query?

How can I make the evaluation happen in a subquery so that the whole table is not scanned?

Recommended answer

Try something like:

select * from table1 
where dt >= from_unixtime(unix_timestamp()-1*60*60*24, 'yyyyMMdd');

This works if you don't mind Hive scanning the entire table. The from_unixtime(unix_timestamp() ...) expression is not deterministic, so the query planner in Hive won't optimize it for you. In many cases (log files, for example), not specifying a deterministic partition key can start a very large Hadoop job, because it scans the whole table rather than just the rows with the given partition key.

If this matters to you, you can launch hive with an additional option:

$ hive -hiveconf date_yesterday=20150331

Then, in the script or the hive terminal, use:

select * from table1
where dt >= ${hiveconf:date_yesterday};

The name of the variable doesn't matter, and the value can be anything you choose; here you can set it to the prior date using Unix commands. In the OP's specific case:

$ hive -hiveconf date_yesterday=$(date --date yesterday "+%Y%m%d")
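
Putting the pieces together, a wrapper script could look roughly like the sketch below. It assumes GNU date (for the --date option) and that the hive CLI is on the PATH. The function name run_for_yesterday is hypothetical, and the query reuses table1 and dt from the question. The function is only defined here, not called, so the date logic can be checked on a machine without Hive.

```shell
#!/bin/sh
# Sketch only: compute yesterday's date once in the shell and pass it to Hive
# as a constant, so the partition filter is a plain literal.

# Assumes GNU date; BSD/macOS date has no --date option.
date_yesterday=$(date --date yesterday "+%Y%m%d")

# Single quotes stop the shell from expanding ${hiveconf:...};
# Hive substitutes the variable itself before running the query.
# The value is quoted inside the query because dt holds strings like '20120805'.
query='select * from table1 where dt = "${hiveconf:date_yesterday}" limit 10;'

# Hypothetical helper: runs the query with the date injected via --hiveconf,
# the long form of the -hiveconf option used above.
run_for_yesterday() {
    hive --hiveconf "date_yesterday=$1" -e "$query"
}
```

To run it, call run_for_yesterday "$date_yesterday" at the end of the script. Because the date reaches Hive as a fixed value, the filter on dt is a constant comparison rather than a function call.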
