Writing data to Hadoop


Question

I need to write data into Hadoop (HDFS) from external sources such as a Windows box. Right now I have been copying the data onto the NameNode and using HDFS's put command to ingest it into the cluster. In my browsing of the code I didn't see an API for doing this. I am hoping someone can show me that I am wrong and that there is an easy way to code external clients against HDFS.

Answer

Install Cygwin, install Hadoop locally (you just need the binary and configs that point at your NN -- no need to actually run the services), run hadoop fs -copyFromLocal /path/to/localfile /hdfs/path/
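
As for the API the question asks about, one does exist: Hadoop ships a Java client, org.apache.hadoop.fs.FileSystem, that can write to HDFS from any machine that can reach the cluster. Below is a minimal sketch; the NameNode URI hdfs://namenode:8020 and the file paths are placeholders for your own values, and it assumes the Hadoop client jars and a matching configuration are on the classpath.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsIngest {
    public static void main(String[] args) throws Exception {
        // Minimal client configuration; hdfs://namenode:8020 is a placeholder
        // for your actual NameNode host and RPC port.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        // Programmatic equivalent of `hadoop fs -copyFromLocal`:
        // copies a local file up into the cluster.
        fs.copyFromLocalFile(new Path("C:/data/input.txt"),
                             new Path("/hdfs/path/input.txt"));

        fs.close();
    }
}
```

Note that the client machine needs network access not just to the NameNode but to the DataNodes as well, since HDFS clients stream block data directly to the DataNodes.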

You can also use the new Cloudera Desktop to upload a file via the web UI, though that might not be a good option for giant files.

There's also a WebDAV overlay for HDFS, but I don't know how stable or reliable it is.
