Writing to Hive from MapReduce (initialize HCatOutputFormat)
Problem description
I wrote an MR job that should load data from HBase and dump it into Hive. Connecting to HBase works, but when I try to save the data into the Hive table, I get the following error message:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
org.apache.oozie.action.hadoop.JavaMainException: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:58)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:38)
at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:118)
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getTableSchema(HCatBaseOutputFormat.java:61)
at com.nrholding.t0_mr.main.DumpProductViewsAggHive.run(DumpProductViewsAggHive.java:254)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.nrholding.t0_mr.main.DumpProductViewsAggHive.main(DumpProductViewsAggHive.java:268)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:55)
... 15 more
I have checked that:
- the table exists
- the setOutput method is called before getTableSchema and setSchema
Here is my run method:
@Override
public int run(String[] args) throws Exception {
    // Create configuration
    Configuration conf = this.getConf();
    String databaseName = null;
    String tableName = "test";

    // Parse arguments
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    getParams(otherArgs);

    // It is better to specify zookeeper quorum in CLI parameter -D hbase.zookeeper.quorum=zookeeper servers
    conf.set("hbase.zookeeper.quorum",
            "cz-dc1-s-132.mall.local,cz-dc1-s-133.mall.local,"
            + "cz-dc1-s-134.mall.local,cz-dc1-s-135.mall.local,"
            + "cz-dc1-s-136.mall.local");

    // Create job
    Job job = Job.getInstance(conf, NAME);
    job.setJarByClass(DumpProductViewsAggHive.class);

    // Setup MapReduce job
    job.setReducerClass(Reducer.class);
    //job.setNumReduceTasks(0); // If reducer is not needed

    // Specify key / value
    job.setOutputKeyClass(Writable.class);
    job.setOutputValueClass(DefaultHCatRecord.class);

    // Input
    getInput(null, dateFrom, dateTo, job, caching, table);

    // Output
    // Ignore the key for the reducer output; emitting an HCatalog record as value
    job.setOutputFormatClass(HCatOutputFormat.class);
    HCatOutputFormat.setOutput(job, OutputJobInfo.create(databaseName, tableName, null));
    HCatSchema s = HCatOutputFormat.getTableSchema(conf);
    System.err.println("INFO: output schema explicitly set for writing:" + s);
    HCatOutputFormat.setSchema(job, s);

    // Execute job and return status
    return job.waitForCompletion(true) ? 0 : 1;
}
Do you have any idea how to help me? Thank you!
Solution
OK, I used the deprecated method:
HCatSchema s = HCatOutputFormat.getTableSchema(job);
Instead of:
HCatSchema s = HCatOutputFormat.getTableSchema(conf);
And it seems to work.
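A likely explanation, offered as a reading of the stack trace rather than something the poster confirmed: Job.getInstance(conf, NAME) makes its own copy of the Configuration, and HCatOutputFormat.setOutput(job, ...) stores the output info in that copy. The original conf object never receives it, so getTableSchema(conf) fails with error 2004, while getTableSchema(job) reads the job's copy. If that is right, passing job.getConfiguration() to the Configuration overload should behave the same way without the deprecated call. The sketch below models only the copy semantics in plain Java; FakeJob, the "output.info" key, and both static helpers are hypothetical stand-ins, not HCatalog or Hadoop APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopyDemo {

    // Hypothetical stand-in for a Hadoop Job: it keeps its own copy of the settings.
    static final class FakeJob {
        private final Map<String, String> conf;

        FakeJob(Map<String, String> original) {
            // Defensive copy, modelling what Job.getInstance(conf) does.
            this.conf = new HashMap<>(original);
        }

        Map<String, String> getConfiguration() {
            return conf;
        }
    }

    // Hypothetical stand-in for setOutput(job, ...): writes into the job's copy only.
    static void setOutput(FakeJob job, String table) {
        job.getConfiguration().put("output.info", table);
    }

    // Hypothetical stand-in for getTableSchema(conf): fails if the output info is missing.
    static String getTableSchema(Map<String, String> conf) {
        String info = conf.get("output.info");
        if (info == null) {
            throw new IllegalStateException("not initialized, setOutput has to be called");
        }
        return "schema of " + info;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        FakeJob job = new FakeJob(conf);
        setOutput(job, "test");

        // Reading from the job's own configuration finds the output info.
        System.out.println(getTableSchema(job.getConfiguration()));

        // Reading from the original object fails, because it was never modified.
        try {
            getTableSchema(conf);
        } catch (IllegalStateException e) {
            System.out.println("original conf: " + e.getMessage());
        }
    }
}
```

Applied to the run method above, the equivalent change would be to read the schema from job.getConfiguration() (or from the job itself, as in the answer) instead of the conf that was created before the job.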