Can we extract the queries that were run on Hive through metadata?
Question
Basically, I want metadata for the queries that ran on Hive in a given day. I looked into the metadata that Hive stores in MySQL, but could not find any table that holds query-related information.
Recommended answer
After some research, I found that we can list the MapReduce jobs using the History Server REST API for Hadoop.
From that you'll get the job-related information.
To get the query, request the configuration of a particular job:
<history_server_http_address:port>/ws/v1/history/mapreduce/jobs/<JOB_ID>/conf
This returns all of the job's configuration properties. The query itself is in the hive.query.string property.
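The two REST calls above can be combined into a small script. The following is a minimal Python sketch, not a definitive implementation: the function names are hypothetical, it assumes the History Server is reachable without authentication, and it treats "one day" as a UTC day using the jobs endpoint's startedTimeBegin and startedTimeEnd filters (epoch milliseconds). Only queries that actually launched MapReduce jobs will show up this way.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import urlopen


def day_window_ms(day):
    """Return (start, end) of a UTC day as epoch milliseconds."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def extract_query(conf_json):
    """Pull hive.query.string out of a parsed /conf response, or None."""
    for prop in conf_json.get("conf", {}).get("property", []):
        if prop.get("name") == "hive.query.string":
            return prop.get("value")
    return None


def get_json(url):
    """Fetch a URL and parse the body as JSON."""
    with urlopen(url) as resp:
        return json.load(resp)


def hive_queries_for_day(base_url, day):
    """Map job ID -> Hive query text for jobs started on the given day.

    base_url is assumed to include the scheme, host and port of the
    History Server (hypothetical value supplied by the caller).
    """
    begin, end = day_window_ms(day)
    params = urlencode({"startedTimeBegin": begin, "startedTimeEnd": end})
    jobs_url = f"{base_url}/ws/v1/history/mapreduce/jobs"
    listing = get_json(f"{jobs_url}?{params}")
    # "jobs" may be null when nothing matches the time window.
    jobs = (listing.get("jobs") or {}).get("job", [])
    queries = {}
    for job in jobs:
        query = extract_query(get_json(f"{jobs_url}/{job['id']}/conf"))
        if query is not None:
            queries[job["id"]] = query
    return queries
```

Calling hive_queries_for_day with your own History Server address (for example a hypothetical "http://historyserver.example.com:19888") and a datetime.date should return a dictionary keyed by job ID. Jobs that were not submitted by Hive have no hive.query.string property and are skipped.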
We can also fetch each job's history file (JSON) and its configuration (XML) directly from HDFS. For this, you need the value of the mapreduce.jobhistory.done-dir property.
Then run an hdfs get command to copy the data locally:
hdfs dfs -get <resource-manager-path>/<year-dir>/<month-dir>/<day-dir> <destination-local-dir>
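Once the files are on local disk, the query text can be read out of each job's configuration XML. The following is a rough Python sketch under a few assumptions: the function names are hypothetical, the configuration files are assumed to end in _conf.xml, and they are assumed to use the standard Hadoop layout of property elements with name and value children. Adjust the file pattern if your cluster names them differently.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def query_from_conf_xml(xml_text):
    """Return the hive.query.string value from Hadoop configuration XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "hive.query.string":
            return prop.findtext("value")
    return None


def queries_in_dir(local_dir):
    """Map each *_conf.xml file name under local_dir to its Hive query, if any."""
    results = {}
    # Assumption: job configuration files are named <job-id>_conf.xml.
    for path in Path(local_dir).rglob("*_conf.xml"):
        query = query_from_conf_xml(path.read_text(encoding="utf-8"))
        if query is not None:
            results[path.name] = query
    return results
```

Passing the destination directory used in the hdfs get command to queries_in_dir should give one entry per Hive job found there. Files without a hive.query.string property are left out.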