Hive not fully honoring fs.default.name/fs.defaultFS value in core-site.xml


Problem description


I have the NameNode service installed on a machine called hadoop.

The core-site.xml file has the fs.defaultFS (equivalent to fs.default.name) set to the following:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop:8020</value>
</property>

I have a very simple table called test_table that currently exists in the Hive server on the HDFS. That is, it is stored under /user/hive/warehouse/test_table. It was created using a very simple command in Hive:

CREATE TABLE test_table (record_id INT);

If I attempt to load data into the table locally (that is, using LOAD DATA LOCAL), everything proceeds as expected. However, if the data is stored on the HDFS and I want to load from there, an issue occurs.

I run a very simple query to attempt this load:

hive> LOAD DATA INPATH '/user/haduser/test_table.csv' INTO TABLE test_table;

Doing so leads to the following error:

FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ''/user/haduser/test_table.csv'':
Move from: hdfs://hadoop:8020/user/haduser/test_table.csv to: hdfs://localhost:8020/user/hive/warehouse/test_table is not valid.
Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.

As the error states, it is attempting to move from hdfs://hadoop:8020/user/haduser/test_table.csv to hdfs://localhost:8020/user/hive/warehouse/test_table. The first path is correct because it references hadoop:8020; the second path is incorrect, because it references localhost:8020.
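Hive's actual legality check lives in its Java semantic analyzer; the following Python sketch (function name and structure are my own, not Hive's) illustrates the idea: a `LOAD DATA INPATH` move is only considered legal when the source and destination URIs share the same filesystem scheme and authority (host:port).

```python
from urllib.parse import urlparse

def move_is_legal(src: str, dst: str) -> bool:
    """Approximation of Hive's check: a load/move is legal only when
    source and destination live on the same filesystem, i.e. the URI
    scheme and authority (host:port) match."""
    s, d = urlparse(src), urlparse(dst)
    return (s.scheme, s.netloc) == (d.scheme, d.netloc)

# The two paths from the error message above:
src = "hdfs://hadoop:8020/user/haduser/test_table.csv"
dst = "hdfs://localhost:8020/user/hive/warehouse/test_table"
print(move_is_legal(src, dst))  # False: authorities differ (hadoop vs localhost)
```

This is why the error appears even though both URIs point at the same physical cluster: `hadoop:8020` and `localhost:8020` are different authorities as far as the comparison is concerned.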

The core-site.xml file clearly states to use hdfs://hadoop:8020, and the hive.metastore.warehouse.dir value in hive-site.xml correctly points to /user/hive/warehouse. Thus, I doubt the error message points at the real cause.

How can I get the Hive server to use the correct NameNode address when creating tables?

Solution

I found that the Hive metastore tracks the location of each table. You can see that location by running the following in the Hive console:

hive> DESCRIBE EXTENDED test_table;

This issue occurs when the NameNode address in core-site.xml is changed while the metastore service is still running. To resolve it, restart the service on that machine:

$ sudo service hive-metastore restart

The metastore will then use the new fs.defaultFS for newly created tables.

Already Existing Tables

The location for tables that already exist can be corrected by running the following commands, taken from Cloudera's documentation on configuring the Hive metastore for high availability. First, list the current FS roots:

$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://localhost:8020/user/hive/warehouse
hdfs://localhost:8020/user/hive/warehouse/test.db

Correcting the NameNode location:

$ /usr/lib/hive/bin/metatool -updateLocation hdfs://hadoop:8020 hdfs://localhost:8020
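Conceptually, `-updateLocation <new> <old>` rewrites the prefix of every location URI stored in the metastore. A minimal sketch of that substitution (a hypothetical helper for illustration, not the actual metatool implementation):

```python
def update_location(location: str, new_root: str, old_root: str) -> str:
    """Rewrite a stored table location if it begins with old_root,
    mirroring what `metatool -updateLocation <new> <old>` does to
    each location URI recorded in the metastore."""
    if location.startswith(old_root):
        return new_root + location[len(old_root):]
    return location

loc = "hdfs://localhost:8020/user/hive/warehouse/test.db"
print(update_location(loc, "hdfs://hadoop:8020", "hdfs://localhost:8020"))
# hdfs://hadoop:8020/user/hive/warehouse/test.db
```

If your Hive version supports it, adding the `-dryRun` flag to metatool previews the changes without writing them to the metastore, which is worth doing before a bulk rewrite.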

Now the listed NameNode is correct.

$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://hadoop:8020/user/hive/warehouse
hdfs://hadoop:8020/user/hive/warehouse/test.db
