Comparison query on nested date partition in Presto/Athena

Question

I have Parquet data stored on S3, partitioned in the layout that Hive understands:

s3://<base_path>/year=2019/month=11/day=08/files.pq

The table schema also declares year, month, and day as partition fields.

Is it possible to run comparison queries, specifically LIKE, IN, and BETWEEN on dates, with this organization of data? An AWS Athena best practices blog post seems to suggest it is possible (SELECT count(*) FROM lineitem WHERE l_shipdate >= '1996-09-01' AND l_shipdate < '1996-10-01'), but I could not figure out how to specify the composite field (l_shipdate in that query), either during table creation or at query time.

Recommended answer

Yes, it is possible, but it doesn't look very elegant: build a date from the three partition columns and compare that.

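-- Assemble a 'YYYY-MM-DD' string from the partition columns, parse it,
-- and compare the result as a DATE.
SELECT col1, col2
FROM my_table
WHERE CAST(date_parse(concat(CAST(year AS VARCHAR(4)), '-',
                             CAST(month AS VARCHAR(2)), '-',
                             CAST(day AS VARCHAR(2))
                             ), '%Y-%m-%d') AS DATE)
      BETWEEN DATE '2018-01-01' AND DATE '2018-01-31'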
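
An alternative, not part of the original answer, is to expand the date range on the client into plain equality predicates on year, month, and day, so that no function wraps the partition columns. The Python sketch below is only an illustration: partition_predicate is a hypothetical helper, my_table, col1, and col2 are reused from the query above, and it assumes the partition columns are strings zero-padded exactly as in the S3 path (month=11, day=08). If the columns are declared as integers, the quoting would need to change.

```python
from datetime import date, timedelta


def partition_predicate(start: date, end: date) -> str:
    """Return a WHERE fragment naming every day partition in [start, end].

    Assumption: year, month and day are string partition columns with
    zero-padded values, matching paths like year=2019/month=11/day=08.
    """
    if end < start:
        raise ValueError("end must not be before start")
    clauses = []
    current = start
    while current <= end:
        # One equality group per calendar day in the range.
        clauses.append(
            f"(year = '{current:%Y}' AND month = '{current:%m}' "
            f"AND day = '{current:%d}')"
        )
        current += timedelta(days=1)
    return " OR ".join(clauses)


# Example: a two-day range, both endpoints inclusive.
predicate = partition_predicate(date(2019, 11, 7), date(2019, 11, 8))
query = f"SELECT col1, col2 FROM my_table WHERE {predicate}"
print(query)
```

The generated predicate grows by one clause per day, so this suits short ranges; for long ranges the computed-date comparison above stays more compact.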
