Hadoop FileSystem.getFS() pauses for about 2 minutes


Problem description


I'm having a very strange problem. I'm using the Pail abstraction from dfs-datastores to write data to HDFS in Java. I don't think the Pail part matters for this problem, though.

When it calls org.apache.hadoop.fs.FileSystem getFS(java.lang.String path) with a path on my local filesystem, it pauses for about 2 minutes, seemingly doing nothing, and then returns. This is on my laptop.

The weird thing is that it worked really fast when I was on the network at my office today, but now that I'm home it's doing it again. I'm running Ubuntu 10.10 64-bit with Java 1.7.

Does anyone have any idea what it's doing? What could be different between being at work and being at home?

UPDATE: I've been stepping through the code with the debugger, and it seems to be having trouble in Configuration.loadResource(). That function is called multiple times, and each call takes 5-10 seconds to return.
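For anyone who wants to confirm where the time goes without single-stepping, a small timing wrapper around the suspect call can help. This is only a minimal sketch using the Java standard library: the class name `CallTimer` and the helper `timeMillis` are hypothetical, and the `Thread.sleep` in `main` is a stand-in for the real call (for example, the `getFS` call from the question).

```java
import java.util.concurrent.Callable;

class CallTimer {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Callable<?> task) throws Exception {
        long start = System.nanoTime();
        task.call();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload: replace the body of call() with the call being investigated.
        long elapsed = timeMillis(new Callable<Object>() {
            @Override
            public Object call() throws Exception {
                Thread.sleep(50);
                return null;
            }
        });
        System.out.println("call took " + elapsed + " ms");
    }
}
```

An anonymous class is used instead of a lambda so the sketch also compiles on Java 1.7, the version mentioned in the question.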

UPDATE2: I've narrowed this down a little further. The biggest hang-up seems to be the call to KerberosName.setConfiguration(). That would explain why it runs fast at work, since Active Directory acts as a Kerberos server there. I don't have one at home, so it can't find one. Now the question is why in the world it's trying to load the Java Kerberos stuff.

Solution

I found a solution (or at least a workaround). I installed the krb5-kdc package, and now my little program runs fast without any unexplained pauses. After this I removed krb5-kdc and tested again, and it was still running fast. Then I removed /etc/krb5.conf, and the pause came back. It looks like using the Hadoop library on Ubuntu (at least) requires an /etc/krb5.conf file.

Maybe this will help someone else.
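Based on the finding above, a program could check for the Kerberos configuration file up front and warn before the slow call happens. The following is a hedged sketch, not part of Hadoop or Pail: the class `Krb5ConfCheck` and its methods are hypothetical names. It looks only at the standard JDK system property `java.security.krb5.conf` and then falls back to `/etc/krb5.conf`, which is a simplification of the JDK's actual lookup order.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Krb5ConfCheck {
    /** Path named by the java.security.krb5.conf property, or /etc/krb5.conf if it is unset. */
    static Path configuredPath() {
        String override = System.getProperty("java.security.krb5.conf");
        if (override != null && !override.isEmpty()) {
            return Paths.get(override);
        }
        return Paths.get("/etc/krb5.conf");
    }

    /** True if the path is an existing, readable regular file. */
    static boolean isPresent(Path path) {
        return Files.isRegularFile(path) && Files.isReadable(path);
    }

    public static void main(String[] args) {
        Path conf = configuredPath();
        if (isPresent(conf)) {
            System.out.println("Kerberos config found: " + conf);
        } else {
            // Assumption drawn from the answer above: a missing file may cause long pauses.
            System.out.println("No readable Kerberos config at " + conf
                    + "; Hadoop calls may pause while Kerberos settings are resolved.");
        }
    }
}
```

Running this before the Hadoop call only reports whether the file is there; it does not create or validate the file's contents.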
