Can we extract the queries that were run on Hive through metadata?

Question

Basically, I want the metadata of the queries that ran on Hive in one day. I looked into the metadata that Hive stores in MySQL, but I was not able to find any table that stores query-related information.

Answer

After doing some research, I found that we can extract the MapReduce jobs using the History Server REST API for Hadoop.

From that you'll get the job-related information.
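
Below is a minimal sketch in Python (standard library only) of how that listing call could look for a one-day window. The History Server host and port are placeholders for your cluster's values; the endpoint and the startedTimeBegin/startedTimeEnd filters (epoch milliseconds) are from the History Server REST API documentation linked further down.

import json
import urllib.request
from urllib.parse import urlencode

HISTORY_SERVER = "http://history-server.example.com:19888"  # placeholder host:port

def list_jobs(started_begin_ms, started_end_ms):
    """List MapReduce jobs whose start time falls inside the given epoch-ms window."""
    params = urlencode({
        "startedTimeBegin": started_begin_ms,
        "startedTimeEnd": started_end_ms,
    })
    url = f"{HISTORY_SERVER}/ws/v1/history/mapreduce/jobs?{params}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # Documented response shape: {"jobs": {"job": [{"id": ..., "user": ..., ...}]}}
    return (data.get("jobs") or {}).get("job") or []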

To get the query, you need to request a particular job's conf:

<history_server_http_address:port>/ws/v1/history/mapreduce/jobs/<JOB_ID>/conf

From this you'll get all the configs. For the query, you need to look at hive.query.string.

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html#History_Server_REST_APIs
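
Continuing the sketch above, this is one way to pull hive.query.string out of a job's conf; the {"conf": {"property": [...]}} response shape follows the documentation linked above. Jobs that were not submitted by Hive simply won't carry this property, so it also works as a filter.

import json
import urllib.request

HISTORY_SERVER = "http://history-server.example.com:19888"  # placeholder host:port

def get_hive_query(job_id):
    """Fetch one job's conf and return hive.query.string, or None if absent."""
    url = f"{HISTORY_SERVER}/ws/v1/history/mapreduce/jobs/{job_id}/conf"
    with urllib.request.urlopen(url) as resp:
        conf = json.load(resp)
    # Documented response shape: {"conf": {"path": ..., "property": [{"name": ..., "value": ...}, ...]}}
    for prop in conf["conf"]["property"]:
        if prop["name"] == "hive.query.string":
            return prop["value"]
    return None  # not a Hive-submitted job, or the property was not recorded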

We can also extract the job's JSON and that job's configuration in XML from the HDFS location. For this, you need the value of the mapreduce.jobhistory.done-dir property.

Then you fire an hdfs get command to fetch the data:

hdfs dfs -get <resource-manager-path>/<year-dir>/<month-dir>/<day-dir> <destination-local-dir>
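
Once the files are local, the per-job configuration XML can be parsed for hive.query.string. A sketch under the assumption that the conf files in the done-dir follow the usual <job_id>_conf.xml naming; verify the layout on your cluster first.

import glob
import xml.etree.ElementTree as ET

def queries_from_conf_files(local_dir):
    """Map each job conf file under local_dir to its hive.query.string value."""
    queries = {}
    # Assumes conf files named like job_<id>_conf.xml under the copied day directory.
    for path in glob.glob(f"{local_dir}/**/*_conf.xml", recursive=True):
        root = ET.parse(path).getroot()
        for prop in root.iter("property"):
            if prop.findtext("name") == "hive.query.string":
                queries[path] = prop.findtext("value")
                break
    return queries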
