Problem with copying local data onto HDFS on a Hadoop cluster using Amazon EC2/S3

Problem description

I have set up a Hadoop cluster containing 5 nodes on Amazon EC2. Now, when I log into the master node and submit the following command:

bin/hadoop jar <program>.jar <arg1> <arg2> <path/to/input/file/on/S3>

It throws the following errors (not at the same time). The first error is thrown when I don't replace the slashes in the secret key with '%2F', and the second when I do replace them:

1) Java.lang.IllegalArgumentException: Invalid hostname in URI S3://<ID>:<SECRETKEY>@<BUCKET>/<path-to-inputfile>
2) org.apache.hadoop.fs.S3.S3Exception: org.jets3t.service.S3ServiceException: S3 PUT failed for '/' XML Error Message: The request signature we calculated does not match the signature you provided. check your key and signing method.
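
One commonly suggested workaround for a secret key containing '/' is to keep the credentials out of the URI entirely and supply them as Hadoop configuration properties, so nothing has to be percent-escaped. A minimal sketch, assuming the jets3t-backed s3n (native) filesystem; the block-based s3 filesystem uses fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey instead, and <ID>, <SECRET-KEY> and <BUCKET> are placeholders:

# Verify access to the input path; the credentials are passed as -D
# configuration properties, so the '/' characters in the secret key
# never need to be escaped.
bin/hadoop fs \
  -D fs.s3n.awsAccessKeyId=<ID> \
  -D fs.s3n.awsSecretAccessKey=<SECRET-KEY> \
  -ls s3n://<BUCKET>/<path-to-inputfile>

The same properties can also be set once in the cluster's Hadoop configuration files, so that plain s3n:// URIs work without embedding credentials at all.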

Please note:

1) When I submitted jps to see what tasks were running on the master, it just showed:

1116 NameNode
1699 Jps
1180 JobTracker

leaving out the DataNode and the TaskTracker (one way to check the worker daemons is sketched after these notes).

2) My secret key contains two '/' (forward slashes), and I replace them with '%2F' in the S3 URI.
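
jps only lists the JVMs on the machine where it is run; in a typical multi-node layout the DataNode and TaskTracker daemons run on the slave nodes rather than the master. A sketch of how to check whether the DataNodes actually registered with the NameNode (<slave-host> is a placeholder):

# Ask the NameNode for a summary of live and dead DataNodes.
bin/hadoop dfsadmin -report

# Or inspect a worker node directly.
ssh <slave-host> jps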

PS: The program runs fine on EC2 when run on a single node. It's only when I launch a cluster that I run into issues related to copying data to/from S3 from/to HDFS. Also, what does distcp do? Do I need to distribute the data even after I copy it from S3 to HDFS? (I thought HDFS took care of that internally.)
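
For reference, distcp is Hadoop's distributed copy tool: it runs a MapReduce job whose map tasks copy files in parallel, which is useful for moving large datasets between S3 and HDFS. Once the data is in HDFS, block placement and replication across the DataNodes are handled by HDFS itself. A sketch of an S3-to-HDFS copy, with the bucket, paths and NameNode address as placeholders and the credentials supplied as configuration properties as in the earlier sketch:

# Copy the input data from S3 into HDFS as a parallel MapReduce job.
bin/hadoop distcp \
  -D fs.s3n.awsAccessKeyId=<ID> \
  -D fs.s3n.awsSecretAccessKey=<SECRET-KEY> \
  s3n://<BUCKET>/<path-to-inputfile> \
  hdfs://<master-node>:9000/user/hadoop/input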

If you could direct me to a link that explains running Map/Reduce programs on a Hadoop cluster using Amazon EC2/S3, that would be great.

Regards,

Deepak.

Answer

You can also use Apache Whirr for this workflow. Check the Quick Start Guide and the 5 minutes guide for more info.

Disclaimer: I'm one of the committers.
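
A rough sketch of the Whirr workflow described in the Quick Start Guide; hadoop.properties is a placeholder for your own cluster definition, which names the cloud provider, credentials and instance templates:

# Credentials are usually picked up from the environment or referenced
# from the properties file.
export AWS_ACCESS_KEY_ID=<ID>
export AWS_SECRET_ACCESS_KEY=<SECRET-KEY>

# Launch a Hadoop cluster on EC2 as described by the properties file.
bin/whirr launch-cluster --config hadoop.properties

# ... run your jobs against the cluster ...

# Tear the cluster down when finished.
bin/whirr destroy-cluster --config hadoop.properties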
