Tables created by an Oozie Hive action cannot be found from the Hive client, but they exist in HDFS


Problem description


I'm trying to run a Hive script via an Oozie Hive action. I created a Hive table 'test' in my script.q, and the Oozie job ran successfully. I can find the table created by the Oozie job under the HDFS path /user/hive/warehouse, but I cannot find the 'test' table with the command "show tables" in the Hive client.

I think there is something wrong with my metastore config, but I just can't figure it out. Can somebody help?

oozie admin -oozie http://localhost:11000/oozie -status

System mode: NORMAL

oozie job -oozie http://localhost:11000/oozie -config C:\Hadoop\oozie-3.2.0-incubating\oozie-win-distro\examples\apps\hive\job.properties -run

Job ID : 0000001-130910094106919-oozie-hado-W

Run Result

Here is my oozie-site.xml


   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -->

<!--
    Refer to the oozie-default.xml file for the complete list of
    Oozie configuration properties and their default values.
-->

<property>
    <name>oozie.service.ActionService.executor.ext.classes</name>
    <value>
        org.apache.oozie.action.email.EmailActionExecutor,
        org.apache.oozie.action.hadoop.HiveActionExecutor,
        org.apache.oozie.action.hadoop.ShellActionExecutor,
        org.apache.oozie.action.hadoop.SqoopActionExecutor
    </value>
</property>

<property>
    <name>oozie.service.SchemaService.wf.ext.schemas</name>
    <value>shell-action-0.1.xsd,email-action-0.1.xsd,hive-action-0.2.xsd,sqoop-action-0.2.xsd,ssh-action-0.1.xsd</value>
</property>

<property>
    <name>oozie.system.id</name>
    <value>oozie-${user.name}</value>
    <description>
        The Oozie system ID.
    </description>
</property>

<property>
    <name>oozie.systemmode</name>
    <value>NORMAL</value>
    <description>
        System mode for  Oozie at startup.
    </description>
</property>

<property>
    <name>oozie.service.AuthorizationService.security.enabled</name>
    <value>false</value>
    <description>
        Specifies whether security (user name/admin role) is enabled or not.
        If disabled any user can manage Oozie system and manage any job.
    </description>
</property>

<property>
    <name>oozie.service.PurgeService.older.than</name>
    <value>30</value>
    <description>
        Jobs older than this value, in days, will be purged by the PurgeService.
    </description>
</property>

<property>
    <name>oozie.service.PurgeService.purge.interval</name>
    <value>3600</value>
    <description>
        Interval at which the purge service will run, in seconds.
    </description>
</property>

<property>
    <name>oozie.service.CallableQueueService.queue.size</name>
    <value>10000</value>
    <description>Max callable queue size</description>
</property>

<property>
    <name>oozie.service.CallableQueueService.threads</name>
    <value>10</value>
    <description>Number of threads used for executing callables</description>
</property>

<property>
    <name>oozie.service.CallableQueueService.callable.concurrency</name>
    <value>3</value>
    <description>
        Maximum concurrency for a given callable type.
        Each command is a callable type (submit, start, run, signal, job, jobs, suspend,resume, etc).
        Each action type is a callable type (Map-Reduce, Pig, SSH, FS, sub-workflow, etc).
        All commands that use action executors (action-start, action-end, action-kill and action-check) use
        the action type as the callable type.
    </description>
</property>

<property>
    <name>oozie.service.coord.normal.default.timeout</name>
    <value>120</value>
    <description>Default timeout for a coordinator action input check (in minutes) for normal job.
        -1 means infinite timeout</description>
</property>

<property>
    <name>oozie.db.schema.name</name>
    <value>oozie</value>
    <description>
        Oozie DataBase Name
    </description>
</property>

<property>
    <name>oozie.service.JPAService.create.db.schema</name>
    <value>true</value>
    <description>
        Creates Oozie DB.

        If set to true, it creates the DB schema if it does not exist. If the DB schema exists is a NOP.
        If set to false, it does not create the DB schema. If the DB schema does not exist it fails start up.
    </description>
</property>

<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>
        JDBC driver class.
    </description>
</property>

<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:derby:${oozie.data.dir}/${oozie.db.schema.name}-db;create=true</value>
    <description>
        JDBC URL.
    </description>
</property>

<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>sa</value>
    <description>
        DB user name.
    </description>
</property>

<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>pwd</value>
    <description>
        DB user password.

        IMPORTANT: if the password is empty, leave a one-space string; the service trims the value.
                   If the value is empty, Configuration assumes it is NULL.
    </description>
</property>

<property>
    <name>oozie.service.JPAService.pool.max.active.conn</name>
    <value>10</value>
    <description>
         Max number of connections.
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
    <value>false</value>
    <description>
        Indicates if Oozie is configured to use Kerberos.
    </description>
</property>

<property>
    <name>local.realm</name>
    <value>LOCALHOST</value>
    <description>
        Kerberos Realm used by Oozie and Hadoop. Using 'local.realm' to be aligned with Hadoop configuration
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.keytab.file</name>
    <value>${user.home}/oozie.keytab</value>
    <description>
        Location of the Oozie user keytab file.
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.kerberos.principal</name>
    <value>${user.name}/localhost@${local.realm}</value>
    <description>
        Kerberos principal for Oozie service.
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
    <value> </value>
    <description>
        Whitelisted job tracker for Oozie service.
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
    <value> </value>
    <description>
        Whitelisted name node for Oozie service.
    </description>
</property>

<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=hadoop-conf</value>
    <description>
        Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
        the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
        used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
        the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
        the Oozie configuration directory; the path can also be absolute (i.e. pointing
        to Hadoop client conf/ directories in the local filesystem).
    </description>
</property>

<property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>/user/${user.name}/share/lib</value>
    <description>
        System library path to use for workflow applications.
        This path is added to workflow application if their job properties sets
        the property 'oozie.use.system.libpath' to true.
    </description>
</property>

<property>
    <name>use.system.libpath.for.mapreduce.and.pig.jobs</name>
    <value>false</value>
    <description>
        If set to true, submissions of MapReduce and Pig jobs will include
        automatically the system library path, thus not requiring users to
        specify where the Pig JAR files are. Instead, the ones from the system
        library path are used.
    </description>
</property>

<property>
    <name>oozie.authentication.type</name>
    <value>simple</value>
    <description>
        Defines authentication used for Oozie HTTP endpoint.
        Supported values are: simple | kerberos | #AUTHENTICATION_HANDLER_CLASSNAME#
    </description>
</property>

<property>
    <name>oozie.authentication.token.validity</name>
    <value>36000</value>
    <description>
        Indicates how long (in seconds) an authentication token is valid before it has
        to be renewed.
    </description>
</property>

<property>
    <name>oozie.authentication.signature.secret</name>
    <value>oozie</value>
    <description>
        The signature secret for signing the authentication tokens.
        If not set a random secret is generated at startup time.
        For authentication to work correctly across multiple hosts,
        the secret must be the same across all the hosts.
    </description>
</property>

<property>
  <name>oozie.authentication.cookie.domain</name>
  <value></value>
  <description>
    The domain to use for the HTTP cookie that stores the authentication token.
    For authentication to work correctly across multiple hosts,
    the domain must be correctly set.
  </description>
</property>

<property>
    <name>oozie.authentication.simple.anonymous.allowed</name>
    <value>true</value>
    <description>
        Indicates if anonymous requests are allowed.
        This setting is meaningful only when using 'simple' authentication.
    </description>
</property>

<property>
    <name>oozie.authentication.kerberos.principal</name>
    <value>HTTP/localhost@${local.realm}</value>
    <description>
        Indicates the Kerberos principal to be used for HTTP endpoint.
        The principal MUST start with 'HTTP/' as per Kerberos HTTP SPNEGO specification.
    </description>
</property>

<property>
    <name>oozie.authentication.kerberos.keytab</name>
    <value>${oozie.service.HadoopAccessorService.keytab.file}</value>
    <description>
        Location of the keytab file with the credentials for the principal.
        Referring to the same keytab file Oozie uses for its Kerberos credentials for Hadoop.
    </description>
</property>

<property>
    <name>oozie.authentication.kerberos.name.rules</name>
    <value>DEFAULT</value>
    <description>
        The kerberos names rules is to resolve kerberos principal names, refer to Hadoop's
        KerberosName for more details.
    </description>
</property>

<!-- Proxyuser Configuration -->

<!--

<property>
    <name>oozie.service.ProxyUserService.proxyuser.#USER#.hosts</name>
    <value>*</value>
    <description>
        List of hosts the '#USER#' user is allowed to perform 'doAs'
        operations.

        The '#USER#' must be replaced with the username of the user who is
        allowed to perform 'doAs' operations.

        The value can be the '*' wildcard or a list of hostnames.

        For multiple users copy this property and replace the user name
        in the property name.
    </description>
</property>

<property>
    <name>oozie.service.ProxyUserService.proxyuser.#USER#.groups</name>
    <value>*</value>
    <description>
        List of groups the '#USER#' user is allowed to impersonate users
        from to perform 'doAs' operations.

        The '#USER#' must be replaced with the username of the user who is
        allowed to perform 'doAs' operations.

        The value can be the '*' wildcard or a list of groups.

        For multiple users copy this property and replace the user name
        in the property name.
    </description>
</property>

-->


Here is my hive-site.xml


[hive-site.xml]

Here is my script.q


create table test(id int);

Solution

Inside your Oozie Hive action, you need to tell Oozie where your Hive metastore is.

That means you need to pass your hive-site.xml to the action.

You also need to configure an external metastore database for Hive for this to work. The default Derby database configuration will not work for you.
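
For context, here is a sketch of why the embedded Derby default behaves this way. Unless it is overridden, Hive's metastore connection URL points at an embedded Derby database with a relative path, so each process that runs Hive creates or opens a `metastore_db` directory under its own working directory. The Oozie launcher task and the interactive Hive client run in different directories, so they would end up with separate metastores while still sharing the same warehouse directory in HDFS. The fragment below is Hive's stock default, shown for illustration; it is not taken from the asker's hive-site.xml, which was not included in the post.

```xml
<!-- Hive's stock default: an embedded Derby database whose path is
     relative to the working directory of the process running Hive -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>
```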

So, in simple steps:

1. Create a Hive configuration that uses an external database (say, MySQL) for the metastore.
2. Pass that hive-site.xml to the Oozie action.
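
The first step might look like the following hive-site.xml fragment. This is a minimal sketch: the host `dbhost`, the database name `metastore`, and the `hiveuser` account with its password are hypothetical placeholders, not values from the question. The property names are the standard Hive JDO connection settings. The MySQL JDBC driver JAR also has to be on the classpath of both the Hive client and the Oozie Hive action (for example, in the workflow application's lib/ directory).

```xml
<configuration>
    <!-- JDBC URL of the shared metastore database (host and database name are placeholders) -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
    </property>
    <!-- MySQL Connector/J driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <!-- Credentials for the metastore database (placeholders) -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hiveuser</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hivepassword</value>
    </property>
</configuration>
```

Using the same file for the Hive client and for the Oozie action is what makes both see the same table definitions.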

See here for details

http://oozie.apache.org/docs/3.3.1/DG_HiveActionExtension.html
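
As a rough illustration of passing hive-site.xml to the action, a workflow.xml could reference the file through the Hive action's `job-xml` element. This is a sketch, not the asker's actual workflow: the workflow and node names are made up, `${jobTracker}` and `${nameNode}` are assumed to be defined in job.properties, and hive-site.xml is assumed to sit next to workflow.xml in the application directory on HDFS. Some older Oozie releases also expected an `oozie.hive.defaults` property in the action configuration, so it is worth checking the documentation for the version in use.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
    <start to="hive-node"/>
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Hive settings, including the external metastore connection -->
            <job-xml>hive-site.xml</job-xml>
            <!-- The script from the question: create table test(id int); -->
            <script>script.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```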

Thanks
