Reading remote HDFS file with Java

Problem Description

I'm having a bit of trouble with a simple Hadoop install. I've downloaded Hadoop 2.4.0 and installed it on a single CentOS Linux node (virtual machine). I've configured Hadoop for a single node with pseudo-distribution, as described on the Apache site (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html). It starts with no issues in the logs, and I can read and write files using the "hadoop fs" commands from the command line.

I'm attempting to read a file from HDFS on a remote machine with the Java API. The machine can connect and list directory contents. It can also determine whether a file exists with this code:

Path p=new Path("hdfs://test.server:9000/usr/test/test_file.txt");
FileSystem fs = FileSystem.get(new Configuration());
System.out.println(p.getName() + " exists: " + fs.exists(p));

The system prints "true", indicating the file exists. However, when I attempt to read the file with:

BufferedReader br = null;
try {
    Path p=new Path("hdfs://test.server:9000/usr/test/test_file.txt");
    FileSystem fs = FileSystem.get(CONFIG);
    System.out.println(p.getName() + " exists: " + fs.exists(p));

    br=new BufferedReader(new InputStreamReader(fs.open(p)));
    String line = br.readLine();

    while (line != null) {
        System.out.println(line);
        line=br.readLine();
    }
}
finally {
    if(br != null) br.close();
}

This code throws the exception:

Exception in thread "main" org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-13917963-127.0.0.1-1398476189167:blk_1073741831_1007 file=/usr/test/test_file.txt

Googling gave some possible tips, but they all checked out. The data node is connected, active, and has enough space. The admin report from hdfs dfsadmin -report shows:

Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 48507940864 (45.18 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used: 53248 (52 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (test.server)
Hostname: test.server
Decommission Status : Normal
Configured Capacity: 52844687360 (49.22 GB)
DFS Used: 53248 (52 KB)
Non DFS Used: 4336746496 (4.04 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.79%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Fri Apr 25 22:16:56 PDT 2014

The client jars were copied directly from the Hadoop install, so there is no version mismatch. I can browse the file system with my Java class and read file attributes. I just can't read the file contents without getting the exception. If I try to write a file with this code:

FileSystem fs = null;
BufferedWriter br = null;

System.setProperty("HADOOP_USER_NAME", "root");

try {
    fs = FileSystem.get(new Configuration());

    //Path p = new Path(dir, file);
    Path p = new Path("hdfs://test.server:9000/usr/test/test.txt");
    br = new BufferedWriter(new OutputStreamWriter(fs.create(p,true)));
    br.write("Hello World");
}
finally {
    if(br != null) br.close();
    if(fs != null) fs.close();
}

This creates the file but doesn't write any bytes, and it throws the exception:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /usr/test/test.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

Googling for this indicated a possible space issue, but the dfsadmin report shows plenty of space. This is a plain vanilla install, and I can't get past this issue.

Environment summary:

Server:

Hadoop 2.4.0 with pseudo-distribution (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html)

CentOS 6.5 virtual machine, 64-bit server
Java 1.7.0_55

Client:

Windows 8 (virtual machine)
Java 1.7.0_51

Any help is greatly appreciated.

Recommended Answer

Hadoop error messages are frustrating. Often they don't say what they mean and have nothing to do with the real issue. I've seen problems like this occur when the client, namenode, and datanode cannot communicate properly. In your case I would pick one of two issues:


  • Your cluster runs inside a virtual machine, and the client's access to that virtualized network is blocked.

  • You are not consistently using fully-qualified domain names (FQDNs) that resolve identically on both the client and the host.

The host name "test.server" is very suspicious. Check all of the following:


  • Is test.server an FQDN?

  • Is that the name used everywhere in your conf files?

  • Can the client and all hosts forward- and reverse-resolve test.server and its IP address and get the same result? (See the sketch after this list.)

  • Are IP addresses used anywhere instead of the FQDN?

  • Is localhost used anywhere?
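
One quick way to answer the resolution questions is a small lookup check run from the client. This is just an illustrative sketch (the class name ResolveCheck is made up here); it assumes test.server should resolve to the server's real address and that the reverse lookup should come back to the same FQDN, not localhost or 127.0.0.1:

import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // Forward lookup: hostname -> IP address
        InetAddress addr = InetAddress.getByName("test.server");
        System.out.println("test.server -> " + addr.getHostAddress());

        // Reverse lookup: IP address -> canonical hostname; this should be the
        // same FQDN, not "localhost" or the loopback address.
        InetAddress byIp = InetAddress.getByAddress(addr.getAddress());
        System.out.println(addr.getHostAddress() + " -> " + byIp.getCanonicalHostName());
    }
}

Run the same check on the server itself; the client and the server should see identical results.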

Any inconsistency in the use of FQDNs, hostnames, numeric IPs, and localhost must be removed. Never mix them in your conf files or in your client code. Consistent use of FQDNs is preferred. Consistent use of numeric IPs usually also works. Using unqualified hostnames, localhost, or 127.0.0.1 causes problems.
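
To illustrate the consistency advice, here is a minimal client-side sketch, not the poster's exact code. It assumes test.server is the FQDN used throughout the server's conf files and that 9000 is the NameNode RPC port, as in the question, and it points fs.defaultFS at that same name so every Path resolves against it:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the same FQDN everywhere: in the conf and in any absolute URIs.
        conf.set("fs.defaultFS", "hdfs://test.server:9000");

        // Connect as an explicit user rather than relying on the local OS account.
        FileSystem fs = FileSystem.get(URI.create("hdfs://test.server:9000"), conf, "root");
        try {
            // Resolved against fs.defaultFS, so no hostname is repeated in the path.
            Path p = new Path("/usr/test/test_file.txt");
            System.out.println(p.getName() + " exists: " + fs.exists(p));
        } finally {
            fs.close();
        }
    }
}

Whether this clears the BlockMissingException still depends on the datanode registering under a name the client can actually reach, so treat it as a sketch of the naming advice above rather than a guaranteed fix.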
