Pig: Hadoop jobs Fail
Problem Description
I have a Pig script that queries data from a CSV file.
The script has been tested locally with small and large .csv files.
On a small cluster, it starts processing the script, but fails after completing 40% of the call.
The error is:
Failed to read data from "path to file"
What I infer is that the script could read the file, but that some connection dropped or a message was lost. However, the only error I get is the one mentioned above.
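For context, a minimal sketch of a Pig script of this shape — the input path, field names, and query logic below are all hypothetical, since the original script is not shown:

```pig
-- Load the CSV (path and schema are assumptions for illustration)
records = LOAD 'input/data.csv' USING PigStorage(',')
          AS (id:int, name:chararray, value:double);

-- A simple query: filter, group, and count
filtered = FILTER records BY value > 0.0;
grouped  = GROUP filtered BY name;
counts   = FOREACH grouped GENERATE group, COUNT(filtered);

STORE counts INTO 'output/counts';
```

A "Failed to read data" error at a partial completion percentage typically surfaces during the LOAD phase on the cluster, even when the same script runs fine in local mode.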
Recommended Answer
An answer to the general problem would be to change the error levels in the configuration files, adding these two lines to mapred-site.xml:
log4j.logger.org.apache.hadoop=error,A
log4j.logger.org.apache.pig=error,A
In my case, it was an OutOfMemory exception.
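If the underlying failure is an OutOfMemory error in the map/reduce tasks, one common remedy is to raise the child task JVM heap in mapred-site.xml. This is a sketch, not part of the original answer; the 1 GB value is an assumption and should be tuned to the cluster's available memory:

```xml
<!-- Raise the heap for map/reduce child JVMs (Hadoop 1.x property name) -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```

After changing this value, the job must be resubmitted for the new heap size to take effect.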