Unable to connect to HDFS using PDI step

Question

I have successfully configured Hadoop 2.4 in an Ubuntu 14.04 VM running on a Windows 8 host. The Hadoop installation works fine, and I can view the NameNode web UI from my Windows browser (screenshot attached in the original post).

So my host name is ubuntu and my HDFS port is 9000 (correct me if I am wrong).

core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ubuntu:9000</value>
</property>
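
To double-check the host name and port before involving PDI, a small script can read them from `fs.defaultFS` and test whether the port accepts TCP connections from the Windows machine. This is only a sketch: the helper names `default_fs_endpoint` and `is_reachable` are made up for illustration, and an open port only shows that something is listening there, not that PDI's Hadoop libraries are compatible with it.

```python
import socket
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def default_fs_endpoint(core_site_xml: str):
    """Return (host, port) taken from the fs.defaultFS property.

    Accepts either a full <configuration> document or a bare <property> element.
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):  # iter() also visits the root element
        if prop.findtext("name") == "fs.defaultFS":
            uri = urlparse(prop.findtext("value").strip())
            return uri.hostname, uri.port
    raise KeyError("fs.defaultFS not found")


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    snippet = """
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://ubuntu:9000</value>
    </property>
    """
    host, port = default_fs_endpoint(snippet)
    print(host, port, is_reachable(host, port))
```

If the check fails from Windows, the host name `ubuntu` may not resolve there (for example, no matching entry in the Windows hosts file), which would be a separate problem from the PDI version.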

The problem occurs when I try to connect to HDFS from my Pentaho Data Integration tool (screenshot attached in the original post).

PDI version: 4.4.0
Step used: Hadoop Copy Files

Please help me connect to HDFS using PDI. Do I need to install or update any JARs for this? Let me know if you need more information.

Recommended answer

As far as I know, PDI 4.4 does not support Hadoop 2.4. In any case, you must set a property to select a particular Hadoop configuration (on the forums you may see a "Hadoop configuration" referred to as a "shim"). The file data-integration/plugins/pentaho-big-data-plugin/plugin.properties contains a property called active.hadoop.configuration. By default it is set to "hadoop-20", which refers to an Apache Hadoop 0.20.x distribution. Set it to the newest distribution that ships with Pentaho, or build your own shim as described in my blog post:
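
The property can be edited by hand in a text editor. For a scripted setup, a minimal sketch along these lines could rewrite it. The function name `set_active_shim` and the shim name `my-shim` are hypothetical placeholders. The real value has to be the name of a Hadoop configuration actually present in your PDI installation.

```python
import re


def set_active_shim(properties_text: str, shim: str) -> str:
    """Return the properties text with active.hadoop.configuration set to `shim`.

    Replaces the existing assignment if present, otherwise appends one.
    """
    pattern = re.compile(
        r"^([ \t]*active\.hadoop\.configuration[ \t]*=).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + shim, properties_text)
    if count == 0:
        # Property missing: append it on its own line.
        new_text = properties_text.rstrip("\n") + f"\nactive.hadoop.configuration={shim}\n"
    return new_text


if __name__ == "__main__":
    # Path taken from the answer, relative to the PDI install directory.
    path = "data-integration/plugins/pentaho-big-data-plugin/plugin.properties"
    with open(path, encoding="utf-8") as fh:
        original = fh.read()
    # "my-shim" is a placeholder, not a real Pentaho shim name.
    updated = set_active_shim(original, "my-shim")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(updated)
```

Back up plugin.properties first, and restart PDI after the change so the new setting is read.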

http://funpdi.blogspot.com/2013/03/pentaho-data-integration-44-and-hadoop.html

Upcoming versions of PDI (5.2+) will support vendor distributions that include Hadoop 2.4+, so keep an eye on the PDI Marketplace and pentaho.com :)
