Loading XML data into a Hive table: org.apache.hadoop.hive.ql.metadata.HiveException


Problem description

I'm trying to load XML data into Hive, but I'm getting an error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":""}

The XML file I used is:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
  <id>11</id>
  <genre>Computer</genre>
  <price>44</price>
</book>
<book>
  <id>44</id>
  <genre>Fantasy</genre>
  <price>5</price>
</book>
</catalog>

The Hive queries I used are:

1) CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/user/xmlfile.xml' OVERWRITE INTO TABLE xmltable;

2) CREATE VIEW xmlview (id,genre,price)
AS SELECT
xpath(xmldata, '/catalog[1]/book[1]/id'),
xpath(xmldata, '/catalog[1]/book[1]/genre'),
xpath(xmldata, '/catalog[1]/book[1]/price')
FROM xmltable;

3) CREATE TABLE xmlfinal AS SELECT * FROM xmlview;

4) SELECT * FROM xmlfinal WHERE id ='11

Everything works up to the second query, but when I run the third query it fails with the error below:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
    at org.apache.hadoop.hive.ql.exec

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

So where is it going wrong? The XML file itself is well-formed.

Thanks, Shree

Solution

This approach uses UDFs from the Brickhouse jar. The original answer also linked to a sample and to a similar Stack Overflow question.

Solution:

-- Load the XML data into the table
DROP TABLE IF EXISTS xmltable;
CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/vijay/data-input.xml' OVERWRITE INTO TABLE xmltable;

-- Check the contents
SELECT * FROM xmltable;

-- Create the view; each xpath() call returns an array of strings
DROP VIEW IF EXISTS MyxmlView;
CREATE VIEW MyxmlView(id, genre, price) AS
SELECT
  xpath(xmldata, 'catalog/book/id/text()'),
  xpath(xmldata, 'catalog/book/genre/text()'),
  xpath(xmldata, 'catalog/book/price/text()')
FROM xmltable;

-- Check the view
SELECT id, genre, price FROM MyxmlView;

-- Add the Brickhouse jar and register its UDFs
ADD JAR /home/vijay/brickhouse-0.7.0-SNAPSHOT.jar;

CREATE TEMPORARY FUNCTION array_index AS 'brickhouse.udf.collect.ArrayIndexUDF';
CREATE TEMPORARY FUNCTION numeric_range AS 'brickhouse.udf.collect.NumericRange';

-- Emit one row per book by indexing the three arrays with the same position n
SELECT
  array_index(id, n) AS my_id,
  array_index(genre, n) AS my_genre,
  array_index(price, n) AS my_price
FROM MyxmlView
LATERAL VIEW numeric_range(size(id)) MyxmlView AS n;

Output:

hive > SELECT
     >    array_index( id, n ) as my_id,
     >    array_index( genre, n ) as my_genre,
     >    array_index( price, n ) as my_price
     > from MyxmlView
     > lateral view numeric_range( size( id )) MyxmlView as n;
Automatically selecting local only mode for query
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/vijay/.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-07-09 05:36:45,220 null map = 0%,  reduce = 0%
2014-07-09 05:36:48,226 null map = 100%,  reduce = 0%
Ended Job = job_local_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
my_id   my_genre   my_price
11      Computer   44
44      Fantasy    5

Time taken: 8.541 seconds, Fetched: 2 row(s)
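
The error row in the question contains only the XML declaration, which suggests the file was loaded one line per row: a TEXTFILE table treats every input line as a separate row, and xpath() fails on fragments that are not well-formed documents. The following is a minimal sketch, assuming the XML has been flattened to a single line before loading and that Hive 0.13 or later is available (for the built-in posexplode). The first query checks how many rows the table holds, and the second unnests the three arrays without the Brickhouse UDFs. The aliases (row_count, ids, id_pos, id_value, and so on) are illustrative names, not part of the original answer.

```sql
-- Sanity check: with the whole document on one line this should return 1.
-- A larger count means each XML line became its own row.
SELECT count(*) AS row_count FROM xmltable;

-- Built-in alternative to array_index/numeric_range (assumption: Hive 0.13+).
-- posexplode emits one row per array element together with its zero-based
-- position; matching the positions pairs up the id, genre and price of
-- the same book.
SELECT
  id_value    AS my_id,
  genre_value AS my_genre,
  price_value AS my_price
FROM MyxmlView
LATERAL VIEW posexplode(id)    ids    AS id_pos, id_value
LATERAL VIEW posexplode(genre) genres AS genre_pos, genre_value
LATERAL VIEW posexplode(price) prices AS price_pos, price_value
WHERE id_pos = genre_pos
  AND id_pos = price_pos;
```

The position filter reduces the cross product of the three lateral views back to one row per book, so this only suits small arrays, and it assumes every book has all three elements so that the positions line up.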

Adding more info, as requested by the question owner:
