Loading XML data into a Hive table: org.apache.hadoop.hive.ql.metadata.HiveException
Problem description
I'm trying to load XML data into Hive, but I'm getting an error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":""}
The XML file I used is:
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
<id>11</id>
<genre>Computer</genre>
<price>44</price>
</book>
<book>
<id>44</id>
<genre>Fantasy</genre>
<price>5</price>
</book>
</catalog>
The Hive queries I used are:
1) CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/user/xmlfile.xml' OVERWRITE INTO TABLE xmltable;
2) CREATE VIEW xmlview (id,genre,price)
AS SELECT
xpath(xmldata, '/catalog[1]/book[1]/id'),
xpath(xmldata, '/catalog[1]/book[1]/genre'),
xpath(xmldata, '/catalog[1]/book[1]/price')
FROM xmltable;
3) CREATE TABLE xmlfinal AS SELECT * FROM xmlview;
4) SELECT * FROM xmlfinal WHERE id = '11';
Everything is fine up to the second query, but when I execute the third query I get the following error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
at org.apache.hadoop.hive.ql.exec
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
So where is it going wrong? As far as I can tell, the XML file itself is well-formed.
Thanks, Shree
Find the jar here: Brickhouse
A sample example is here: Example
A similar example on Stack Overflow: here
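A likely cause, judging from the failing row in the stack trace: a TEXTFILE table stores one row per input line, so a pretty-printed XML file becomes many rows, and xpath() is then called on fragments such as the XML declaration alone, which are not complete documents. Creating the view in step 2 only stores the query, so the failure first shows up when step 3 actually runs it. The Python sketch below illustrates this outside Hive and produces a single-line version of the document that could be written to a file and loaded instead. The helper names `is_well_formed` and `flatten_xml` are illustrative assumptions, not part of the original answer.

```python
import xml.etree.ElementTree as ET

# The sample catalog from the question, pretty-printed over several lines.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
<id>11</id>
<genre>Computer</genre>
<price>44</price>
</book>
<book>
<id>44</id>
<genre>Fantasy</genre>
<price>5</price>
</book>
</catalog>
"""


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a complete XML document."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def flatten_xml(text: str) -> str:
    """Join all lines of an XML document into one line (hypothetical helper).

    Only safe when the line breaks between tags carry no meaning,
    as is the case for the sample above.
    """
    return "".join(line.strip() for line in text.splitlines())


# What a line-oriented table sees: one row per line of the file.
rows = SAMPLE.splitlines()
broken_rows = [row for row in rows if not is_well_formed(row)]

# What it sees after flattening: a single row holding the whole document.
single_row = flatten_xml(SAMPLE)

print(len(rows), "rows, of which", len(broken_rows), "are not well-formed")
print("flattened document is well-formed:", is_well_formed(single_row))
```

If this is indeed the cause, the input file used in the solution below would need to hold the whole catalog on one line.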
Solution:
-- Load XML data into the table
DROP TABLE IF EXISTS xmltable;
CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/vijay/data-input.xml' OVERWRITE INTO TABLE xmltable;
-- Check contents
SELECT * FROM xmltable;
-- Create view
DROP VIEW IF EXISTS MyxmlView;
CREATE VIEW MyxmlView(id, genre, price) AS
SELECT
  xpath(xmldata, 'catalog/book/id/text()'),
  xpath(xmldata, 'catalog/book/genre/text()'),
  xpath(xmldata, 'catalog/book/price/text()')
FROM xmltable;
-- Check view
SELECT id, genre, price FROM MyxmlView;
-- Add the Brickhouse jar
ADD JAR /home/vijay/brickhouse-0.7.0-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION array_index AS 'brickhouse.udf.collect.ArrayIndexUDF';
CREATE TEMPORARY FUNCTION numeric_range AS 'brickhouse.udf.collect.NumericRange';
-- Emit one row per book by indexing into the three arrays
SELECT
  array_index(id, n) AS my_id,
  array_index(genre, n) AS my_genre,
  array_index(price, n) AS my_price
FROM MyxmlView
LATERAL VIEW numeric_range(size(id)) MyxmlView AS n;
Output:
hive > SELECT
> array_index( id, n ) as my_id,
> array_index( genre, n ) as my_genre,
> array_index( price, n ) as my_price
> from MyxmlView
> lateral view numeric_range( size( id )) MyxmlView as n;
Automatically selecting local only mode for query
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/vijay/.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-07-09 05:36:45,220 null map = 0%, reduce = 0%
2014-07-09 05:36:48,226 null map = 100%, reduce = 0%
Ended Job = job_local_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
my_id my_genre my_price
11 Computer 44
44 Fantasy 5
Time taken: 8.541 seconds, Fetched: 2 row(s)
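The last query turns the three parallel arrays returned by xpath() into one row per book: numeric_range(size(id)) appears to yield the indexes 0 to size-1, and array_index picks the matching element from each array. A minimal Python sketch of that index-wise expansion, assuming all three arrays have the same length (the function name `expand_rows` is hypothetical):

```python
from typing import Iterator, List, Tuple


def expand_rows(
    ids: List[str], genres: List[str], prices: List[str]
) -> Iterator[Tuple[str, str, str]]:
    """Yield one (id, genre, price) tuple per array position."""
    # range(len(ids)) plays the role of numeric_range(size(id)).
    for n in range(len(ids)):
        # Indexing each list plays the role of array_index(column, n).
        yield ids[n], genres[n], prices[n]


# Arrays as the view would return them for the sample catalog.
rows = list(expand_rows(["11", "44"], ["Computer", "Fantasy"], ["44", "5"]))
for row in rows:
    print("\t".join(row))
```

Hive 0.13 and later also ship a built-in posexplode() table function that returns each element together with its position, which may make the Brickhouse jar unnecessary for this step.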
Adding-more-info as requested by Question owner: