How to convert a Hive query into an abstract syntax tree?


Problem description



Can anyone tell me how to convert a Hive query into an abstract syntax tree? For example: select * from orders where cust_num = 100; How can I convert this into an AST? And how can I convert that AST into a QB tree? Please help. Thanks in advance.

Solution

You can make use of the EXPLAIN command with EXTENDED. Suppose I have a table called demo with a column n1; issuing EXPLAIN EXTENDED gives me this:

hive> EXPLAIN EXTENDED select * from demo where n1='aaa';
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME demo))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL n1) 'aaa'))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        demo 
          TableScan
            alias: demo
            GatherStats: false
            Filter Operator
              isSamplingPred: false
              predicate:
                  expr: (n1 = 'aaa')
                  type: boolean
              Select Operator
                expressions:
                      expr: n1
                      type: string
                      expr: n2
                      type: string
                outputColumnNames: _col0, _col1
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  directory: hdfs://localhost:9000/tmp/hive-apache/hive_2013-06-13_19-55-21_578_6086176948010779575/-ext-10001
                  NumFilesPerFileSink: 1
                  Stats Publishing Key Prefix: hdfs://localhost:9000/tmp/hive-apache/hive_2013-06-13_19-55-21_578_6086176948010779575/-ext-10001/
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      properties:
                        columns _col0,_col1
                        columns.types string:string
                        escape.delim \
                        serialization.format 1
                  TotalFiles: 1
                  GatherStats: false
                  MultiFileSpray: false
      Needs Tagging: false
      Path -> Alias:
        hdfs://localhost:9000/user/hive/warehouse/demo [demo]
      Path -> Partition:
        hdfs://localhost:9000/user/hive/warehouse/demo 
          Partition
            base file name: demo
            input format: org.apache.hadoop.mapred.TextInputFormat
            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
            properties:
              bucket_count -1
              columns n1,n2
              columns.types string:string
              field.delim ,
              file.inputformat org.apache.hadoop.mapred.TextInputFormat
              file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              location hdfs://localhost:9000/user/hive/warehouse/demo
              name default.demo
              serialization.ddl struct demo { string n1, string n2}
              serialization.format ,
              serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              transient_lastDdlTime 1370932655
            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              properties:
                bucket_count -1
                columns n1,n2
                columns.types string:string
                field.delim ,
                file.inputformat org.apache.hadoop.mapred.TextInputFormat
                file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                location hdfs://localhost:9000/user/hive/warehouse/demo
                name default.demo
                serialization.ddl struct demo { string n1, string n2}
                serialization.format ,
                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                transient_lastDdlTime 1370932655
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.demo
            name: default.demo

  Stage: Stage-0
    Fetch Operator
      limit: -1


Time taken: 5.316 seconds
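If you want the AST programmatically rather than copied out of EXPLAIN output, Hive's own parser classes can be invoked directly. The following is a minimal sketch, not part of the original answer, assuming the org.apache.hadoop.hive.ql.parse.ParseDriver API from the hive-exec jar as it exists in older Hive releases; method names and signatures may differ in newer versions.

import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.ParseDriver;
import org.apache.hadoop.hive.ql.parse.ParseException;

public class HiveAstDemo {
    public static void main(String[] args) throws ParseException {
        // ParseDriver wraps the ANTLR-generated HiveQL parser.
        ParseDriver pd = new ParseDriver();
        // parse() returns the root ASTNode of the abstract syntax tree.
        ASTNode ast = pd.parse("select * from orders where cust_num = 100");
        // toStringTree() prints the nested (TOK_...) form, the same shape
        // shown under ABSTRACT SYNTAX TREE in EXPLAIN EXTENDED output.
        System.out.println(ast.toStringTree());
    }
}

As for going from the AST to a QB tree: that conversion happens inside Hive's SemanticAnalyzer (its phase-1 traversal of the AST populates the QB object), which is internal rather than a public API, so inspecting the plan with EXPLAIN EXTENDED as above is the usual way to see the result.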

