Reading/Writing files to HDFS from a Windows server

Question

I want to write files to HDFS from a Windows server. The Hadoop cluster runs on Linux. Everywhere I looked, I only found Java code that is meant to be run with "hadoop jar".

Can somebody help me understand how to run Java code that writes files to HDFS from Windows? What is required on the Windows box? Even a pointer to a good link will do.

Recommended answer

You only need to write a simple Java program and run it like a normal .jar file.

In the project, you need to import the Hadoop library.

This is a working example Maven project (I tested it on my cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;


public class WriteFileToHdfs {

    public static void main(String[] args) throws IOException, URISyntaxException {

        // Replace the bracketed placeholders with your NameNode address and port.
        String dataNameLocation = "hdfs://[your-namenode-ip]:[the-port-where-hadoop-is-listening]/";

        Configuration configuration = new Configuration();

        // try-with-resources closes the stream and the file system even if a write fails.
        try (FileSystem hdfs = FileSystem.get(new URI(dataNameLocation), configuration);
             FSDataOutputStream out = hdfs.create(new Path("/myFile.txt"))) {
            // writeUTF prefixes the text with a two-byte length (DataOutputStream format).
            out.writeUTF("Some text ...");
        }

    }

}
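
One detail of the example above is worth checking before relying on the file contents: FSDataOutputStream extends java.io.DataOutputStream, so writeUTF stores a two-byte length in front of the text. The following is a minimal sketch using only the JDK (no cluster involved) that compares the bytes produced by writeUTF with the bytes produced by writing the UTF-8 encoding directly. The class name WriteUtfDemo and its helper methods are hypothetical and not part of the original answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares two ways of writing text to a DataOutputStream.
public class WriteUtfDemo {

    // Bytes produced by writeUTF: a two-byte length followed by the encoded text.
    static byte[] withWriteUtf(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        }
        return buffer.toByteArray();
    }

    // Bytes produced by writing the UTF-8 encoding directly: the text only.
    static byte[] withPlainWrite(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "Some text ...";
        System.out.println("writeUTF bytes: " + withWriteUtf(text).length);
        System.out.println("plain bytes:    " + withPlainWrite(text).length);
    }
}
```

If the file is meant to be read as plain text by other tools, writing the encoded bytes with out.write(...) is probably the better fit. writeUTF is suited to files that will be read back with readUTF.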

Remember to add the dependencies to your pom.xml, along with the configuration that writes the main class into the jar's manifest file:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <mainClass>your.cool.package.WriteFileToHdfs</mainClass>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.1</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <artifactId>maven-dependency-plugin</artifactId>
            <executions>
                <execution>
                    <phase>install</phase>
                    <goals>
                        <goal>copy-dependencies</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>${project.build.directory}/lib</outputDirectory>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <artifactId>maven-jar-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <addClasspath>true</addClasspath>
                        <classpathPrefix>lib/</classpathPrefix>
                        <mainClass>${mainClass}</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
    </plugins>
</build>

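Build the project with mvn install (the phase in which the dependencies are copied into the lib directory next to the jar), then just launch the program with the command: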

java -jar nameOfTheJarFile.jar


Of course, you need to edit the code to use your own package name and NameNode IP address.
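
When filling in the NameNode address, it can help to build the hdfs:// URI from its parts rather than by string concatenation, so that a malformed host or port is caught before any connection attempt. The sketch below uses only java.net.URI. The class name HdfsUriDemo, the host 192.0.2.10 and the port 8020 are placeholder assumptions, not values from the original answer. Use whatever address and port your cluster's NameNode actually listens on.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class: assembles the NameNode URI from a host and a port.
public class HdfsUriDemo {

    // Builds "hdfs://host:port/"; throws URISyntaxException if the parts
    // do not form a valid URI.
    static URI nameNodeUri(String host, int port) throws URISyntaxException {
        return new URI("hdfs", null, host, port, "/", null, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Placeholder values; replace them with your NameNode address and port.
        URI uri = nameNodeUri("192.0.2.10", 8020);
        System.out.println(uri);
    }
}
```

The resulting URI object can be passed directly to FileSystem.get in place of new URI(dataNameLocation).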
