Apache Pig ERROR 6007: "Unable to check name" message

Problem Description


Environment: hadoop 1.0.3, hbase 0.94.1, pig 0.11.1

I am running a Pig script from a Java program, and I sometimes get the following error, though not every time. The program loads a file from HDFS, does some transformations, and stores the result into HBase. My program is multi-threaded, I have already made PigServer thread-safe, and the "/user/root" directory exists in HDFS. Here is a snippet of the program and the exception I got. Please advise.

pigServer = PigFactory.getServer();
URL path = getClass().getClassLoader().getResource("cfg/concatall.py");
LOG.info("CDNResolve2Hbase: reading concatall.py file from " + path.toString());
pigServer.getPigContext().getProperties().setProperty(PigContext.JOB_NAME, "CDNResolve2Hbase");
pigServer.registerQuery("A = load '" + inputPath + "' using PigStorage('\t') as (ip:chararray, do:chararray, cn:chararray, cdn:chararray, firsttime:chararray, updatetime:chararray);");
pigServer.registerCode(path.toString(), "jython", "myfunc");
pigServer.registerQuery("B = foreach A generate myfunc.concatall('" + extractTimestamp(inputPath) + "',ip,do,cn), cdn, SUBSTRING(firsttime,0,8);");
outputTable = "hbase://" + outputTable;
ExecJob job = pigServer.store("B", outputTable, "org.apache.pig.backend.hadoop.hbase.HBaseStorage('d:cdn d:dtime')");

My PigFactory has the following code:

private static ThreadLocal<PigServer> pigServer = new ThreadLocal<PigServer>();

public static synchronized PigServer getServer() {
    if (pigServer.get() == null) {
        try {
            printClassPath();
            Properties prop = SystemUtils.getCfg();
            pigServer.set(new PigServer(ExecType.MAPREDUCE, prop));
            return pigServer.get();
        } catch (Exception e) {
            LOG.error("error in starting PigServer:", e);
            return null;
        }
    }
    return pigServer.get();
}

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:529)
    at com.hugedata.cdnserver.datanalysis.CDNResolve2Hbase.execute(Unknown Source)
    at com.hugedata.cdnserver.DatAnalysis.cdnResolve2Hbase(Unknown Source)
    at com.hugedata.cdnserver.task.HandleDomainNameLogTask.execute(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:273)
    at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:264)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: Failed to parse: Pig script failed to parse: pig script failed to validate: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
    ... 15 more
Caused by: pig script failed to validate: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:835)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3236)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1315)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
    ... 16 more
Caused by: org.apache.pig.backend.datastorage.DataStorageException: ERROR 6007: Unable to check name hdfs://DC-001:9000/user/root
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:207)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:128)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:138)
    at org.apache.pig.parser.QueryParserUtils.getCurrentDir(QueryParserUtils.java:91)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:827)
    ... 22 more
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:264)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:873)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:200)
    ... 26 more

Solution

The error you're getting suggests that you're not using the same Hadoop client as on your server. Can you check the Hadoop version installed locally?
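Besides verifying the client version, note that the root cause at the bottom of the trace, java.io.IOException: Filesystem closed, is commonly associated with Hadoop's shared FileSystem cache: all threads in a JVM receive the same cached FileSystem instance, so one thread closing it breaks every other thread's DFSClient. A minimal, hedged sketch of one possible workaround, assuming the Properties returned by SystemUtils.getCfg() are passed through to Hadoop's Configuration by PigServer; the helper class name PigServerProps is hypothetical:

```java
import java.util.Properties;

// Hypothetical helper; in the real code these entries would be merged into
// whatever SystemUtils.getCfg() already returns.
public class PigServerProps {

    public static Properties buildProps() {
        Properties prop = new Properties();
        // With the cache disabled, FileSystem.get() hands each caller a fresh
        // instance, so one thread calling close() cannot invalidate the
        // DFSClient that another thread's PigServer is still using.
        prop.setProperty("fs.hdfs.impl.disable.cache", "true");
        return prop;
    }

    public static void main(String[] args) {
        System.out.println(
            buildProps().getProperty("fs.hdfs.impl.disable.cache"));
    }
}
```

These properties would then be handed to new PigServer(ExecType.MAPREDUCE, prop) as in the PigFactory shown in the question. This is a sketch under the stated assumptions, not a confirmed fix; the version check above is still worth doing first.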
