Apache Pig - ERROR 6007: "Unable to check name" message


Problem description



Environment: hadoop 1.0.3, hbase 0.94.1, pig 0.11.1

I am running a Pig script from a Java program, and I sometimes get the following error, but not every time. The program loads a file from HDFS, does some transformations, and stores the result into HBase. My program is multi-threaded. I have already made PigServer thread-safe, and I have the "/user/root" directory created in HDFS. Here is a snippet of the program and the exception I get. Please advise.

pigServer = PigFactory.getServer();
URL path = getClass().getClassLoader().getResource("cfg/concatall.py");
LOG.info("CDNResolve2Hbase: reading concatall.py file from " + path.toString());
pigServer.getPigContext().getProperties().setProperty(PigContext.JOB_NAME, "CDNResolve2Hbase");
pigServer.registerQuery("A = load '" + inputPath + "' using PigStorage('\t') as (ip:chararray, do:chararray, cn:chararray, cdn:chararray, firsttime:chararray, updatetime:chararray);");
pigServer.registerCode(path.toString(), "jython", "myfunc");
pigServer.registerQuery("B = foreach A generate myfunc.concatall('" + extractTimestamp(inputPath) + "', ip, do, cn), cdn, SUBSTRING(firsttime, 0, 8);");
outputTable = "hbase://" + outputTable;
ExecJob job = pigServer.store("B", outputTable, "org.apache.pig.backend.hadoop.hbase.HBaseStorage('d:cdn d:dtime')");

and my PigFactory has the following code

private static ThreadLocal<PigServer> pigServer = new ThreadLocal<PigServer>();

public static synchronized PigServer getServer() {
    if (pigServer.get() == null) {
        try {
            printClassPath();
            Properties prop = SystemUtils.getCfg();
            pigServer.set(new PigServer(ExecType.MAPREDUCE, prop));
            return pigServer.get();
        } catch (Exception e) {
            LOG.error("error in starting PigServer:", e);
            return null;
        }
    }
    return pigServer.get();
}

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:529)
    at com.hugedata.cdnserver.datanalysis.CDNResolve2Hbase.execute(Unknown Source)
    at com.hugedata.cdnserver.DatAnalysis.cdnResolve2Hbase(Unknown Source)
    at com.hugedata.cdnserver.task.HandleDomainNameLogTask.execute(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:273)
    at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:264)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: Failed to parse: Pig script failed to parse: pig script failed to validate: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
    ... 15 more
Caused by: pig script failed to validate: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:835)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3236)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1315)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
    ... 16 more
Caused by: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:207)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:128)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:138)
    at org.apache.pig.parser.QueryParserUtils.getCurrentDir(QueryParserUtils.java:91)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:827)
    ... 22 more
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:264)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:873)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:200)
    ... 26 more
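Note that the deepest cause in the trace, "java.io.IOException: Filesystem closed", is the classic symptom of one thread closing a handle that other threads still share: Hadoop's FileSystem.get() hands the same cached instance to every caller in a JVM by default. The following plain-Java sketch illustrates the hazard only; "SharedHandle" is a hypothetical stand-in for the cached FileSystem (no Hadoop dependency), not the asker's actual code.

```java
// Hypothetical stand-in for a cached, shared handle such as Hadoop's FileSystem.
// Because FileSystem.get() returns the same cached instance to every caller,
// close() in one thread invalidates the handle for all the others.
class SharedHandle implements AutoCloseable {
    private volatile boolean open = true;

    void check() throws java.io.IOException {
        if (!open) throw new java.io.IOException("Filesystem closed");
    }

    @Override
    public void close() { open = false; }
}

public class SharedCloseDemo {
    public static void main(String[] args) throws Exception {
        SharedHandle cached = new SharedHandle(); // both threads receive this same instance
        cached.check();                           // thread A: works fine
        cached.close();                           // thread A finishes its job and closes the handle
        try {
            cached.check();                       // thread B: fails, although it closed nothing itself
        } catch (java.io.IOException e) {
            System.out.println("thread B sees: " + e.getMessage());
        }
    }
}
```

This would also explain why the error is intermittent: it only appears when thread scheduling lets one job close the shared handle while another is still parsing.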

Solution

The error you're getting suggests that you're not using the same Hadoop client version as the one running on your server. Can you check the Hadoop version installed locally?
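One way to act on this suggestion is to compare the client-side Hadoop version (reported by org.apache.hadoop.util.VersionInfo.getVersion(), or by running "hadoop version" on each machine) against the version the cluster runs. The sketch below is a hypothetical startup check; the version strings are hard-coded from this question's environment line so the example compiles without the Hadoop jars.

```java
// Hypothetical fail-fast check: warn when the Hadoop client jars on the
// classpath do not match the version the cluster is known to run.
public class VersionCheck {
    static boolean matches(String clientVersion, String expectedServerVersion) {
        return clientVersion.equals(expectedServerVersion);
    }

    public static void main(String[] args) {
        // In a real program, clientVersion would come from
        // org.apache.hadoop.util.VersionInfo.getVersion(); it is hard-coded
        // here ("1.0.3" is the server version from the environment line above).
        String clientVersion = "1.0.3";
        System.out.println(matches(clientVersion, "1.0.3")
                ? "versions match"
                : "version mismatch - replace the client jars");
    }
}
```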
