UnknownHostException does not recover when the network is back, but restarting the JVM fixes it


Problem description

At some point our JVM (in fact a YARN NodeManager) started to report UnknownHostException. The exception is raised by this call in the JVM code:

return InetAddress.getByName(host);

For more than two days afterwards the exception persisted. While the error was being reported, I ran the following tests:

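  1. While the error was occurring, ping to the host succeeded and resolved its IP address (very strange);
  2. While the error was occurring, I wrote a simple test case to check hostname resolution, and it also succeeded;
  3. After restarting the JVM, the error disappeared.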

This is the code I used for the test:

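import java.net.InetSocketAddress;
import org.apache.hadoop.net.NetUtils;

public class Main {
  public static void main(String[] args) {
    // Resolve the host through Hadoop's NetUtils
    InetSocketAddress addr = NetUtils.createSocketAddr("host-name:8020");
    // Prints false when the hostname was resolved successfully
    System.out.println(addr.isUnresolved());
  }
}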




// NetUtils is a Hadoop class (org.apache.hadoop.net.NetUtils) that simply
// calls InetAddress.getByName() through SecurityUtil.getByName()
public static InetSocketAddress createSocketAddrForHost(String host, int port) {
  String staticHost = getStaticResolution(host);
  String resolveHost = (staticHost != null) ? staticHost : host;

  InetSocketAddress addr;
  try {
    InetAddress iaddr = SecurityUtil.getByName(resolveHost);
    // if there is a static entry for the host, make the returned
    // address look like the original given host
    if (staticHost != null) {
      iaddr = InetAddress.getByAddress(host, iaddr.getAddress());
    }
    addr = new InetSocketAddress(iaddr, port);
  } catch (UnknownHostException e) {
    // resolution failed: hand back an address that is marked unresolved
    addr = InetSocketAddress.createUnresolved(host, port);
  }
  return addr;
}

We have not changed /etc/hosts for a long time.

Environment: JDK: java version "1.8.0_121"; OS:

Distributor ID: Ubuntu
Description:    Ubuntu 14.04.5 LTS
Release:    14.04
Codename:   trusty

I believe there really was a network problem at the time the error first occurred. But what is strange is:

  1. Why can it not recover after the network is back (for example, by the time I found the error and ran my tests and ping)? In fact the network problem lasted only 30 minutes, but the JVM kept reporting these errors.
  2. Why is the problem gone after I restart the JVM?

I checked the JVM configuration: networkaddress.cache.ttl and networkaddress.cache.negative.ttl both have their default values. So when a hostname fails to resolve, a retry should succeed once the network is back.
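The retry expectation above can be sketched as follows. This is only an illustration, not the Hadoop code path: the class name ReResolve, the method resolveWithRetry, and its attempts/delay parameters are hypothetical. It relies on one documented JDK fact: an InetSocketAddress is immutable, so an instance that came back unresolved stays unresolved, and a new instance has to be constructed to trigger another lookup.

```java
import java.net.InetSocketAddress;

class ReResolve {
    /**
     * Hypothetical helper: tries to resolve host:port up to `attempts` times.
     * Each attempt builds a NEW InetSocketAddress, because an existing
     * unresolved instance is never re-resolved by the JDK.
     */
    static InetSocketAddress resolveWithRetry(String host, int port,
                                              int attempts, long delayMillis)
            throws InterruptedException {
        // The constructor performs a lookup and marks the address
        // unresolved if the lookup fails.
        InetSocketAddress addr = new InetSocketAddress(host, port);
        for (int i = 1; i < attempts && addr.isUnresolved(); i++) {
            // Retries made within networkaddress.cache.negative.ttl seconds
            // may be answered from the negative cache, so wait in between.
            Thread.sleep(delayMillis);
            addr = new InetSocketAddress(host, port);
        }
        return addr;
    }

    public static void main(String[] args) throws InterruptedException {
        // "host-name" and 8020 are the placeholder values from the question.
        InetSocketAddress addr = resolveWithRetry("host-name", 8020, 3, 1000L);
        System.out.println(addr + " unresolved=" + addr.isUnresolved());
    }
}
```

If a long-lived component keeps the unresolved object instead of building a new one, it would keep failing until it is restarted, which is one possible reading of the symptoms described here.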

Recommended answer

What you are describing sounds like the JVM cached a hostname lookup.

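From the Javadoc of InetAddress:

By default, when a security manager is installed, in order to protect against DNS spoofing attacks, the result of positive host name resolutions are cached forever.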

Depending on the security manager setup, the default for networkaddress.cache.ttl can cause a hostname to be looked up once and the result cached indefinitely for the life of the JVM. Try setting it to a non-default value; for example, to cache lookups for 10 seconds, set it to "10".
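One way to apply that suggestion programmatically is sketched below. The class and method names (CacheTtlConfig, configureDnsCache) are hypothetical; Security.setProperty and the property name are standard JDK. The sketch assumes the property is set before the first InetAddress lookup in the process, since the JDK reads its cache policy once at initialization.

```java
import java.net.InetAddress;
import java.security.Security;

class CacheTtlConfig {
    /**
     * Hypothetical helper: sets how long successful lookups are cached.
     * Assumption: this runs before any InetAddress lookup in this JVM.
     */
    static void configureDnsCache(int positiveTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
    }

    public static void main(String[] args) throws Exception {
        configureDnsCache(10); // cache successful lookups for 10 seconds
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
        // Lookups from here on use the 10-second policy.
        System.out.println(InetAddress.getByName("localhost"));
    }
}
```

The same value can instead be set in the java.security file of the JRE, which avoids depending on initialization order inside the application.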

Here is the networking property:

networkaddress.cache.ttl

Specified in java.security to indicate the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup.

A value of -1 indicates "cache forever". The default behavior is to cache forever when a security manager is installed, and to cache for an implementation-specific period of time when a security manager is not installed.
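To see which values a given JVM is actually configured with, the security properties can be read back. This is a minimal sketch with hypothetical names (ShowDnsCacheSettings, describe); a null result only means the property is not set in java.security, in which case the implementation default described above applies.

```java
import java.security.Security;

class ShowDnsCacheSettings {
    /** Hypothetical helper: formats one security property for printing. */
    static String describe(String property) {
        String value = Security.getProperty(property);
        return property + " = "
                + (value == null ? "(not set, implementation default applies)" : value);
    }

    public static void main(String[] args) {
        System.out.println(describe("networkaddress.cache.ttl"));
        System.out.println(describe("networkaddress.cache.negative.ttl"));
    }
}
```

Running this inside the affected process (or with the same JRE and options) is more reliable than assuming the defaults.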

The default for networkaddress.cache.negative.ttl is 10, but I suspect that isn't affecting your application's behavior.
