Akka.net remoting over Docker containers: client randomly fails to connect to host

Problem description

There is a simple host with a TestActor that only writes a string it receives to the console:

using (var actorSystem = ActorSystem.Create("host", HoconLoader.FromFile("config.hocon")))
{
    var testActor = actorSystem.ActorOf(Props.Create<TestActor>(), "TestActor");

    Console.WriteLine($"Waiting for requests...");

    while (true)
    {
        Task.Delay(1000).Wait();
    }
}
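
The TestActor itself is not shown above. A minimal sketch of what such an actor might look like, assuming a hypothetical `TestMessage` class with a `Text` property and an `"ACK"` string reply (neither appears in the original post):

```csharp
using System;
using Akka.Actor;

// Hypothetical message type; the original post does not show its definition.
public sealed class TestMessage
{
    public TestMessage(string text)
    {
        Text = text;
    }

    public string Text { get; }
}

// Sketch of an actor that writes each received string to the console.
public sealed class TestActor : ReceiveActor
{
    public TestActor()
    {
        Receive<TestMessage>(message =>
        {
            Console.WriteLine(message.Text);

            // Assumed reply: without one, an Ask on the sender's side never completes.
            Sender.Tell("ACK", Self);
        });
    }
}
```

The reply matters here because the client described next waits on an Ask, which only completes once the actor answers the sender.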

On the other side there is a simple client that selects the remote actor and passes a TestMessage to it, then waits on an ask without a timeout specified.

using (var actorSystem = ActorSystem.Create("client", HoconLoader.FromFile("config.hocon")))
{
    var testActor = actorSystem.ActorSelection("akka.tcp://host@host:8081/user/TestActor");

    Console.WriteLine($"Sending message...");

    testActor.Ask(new TestMessage($"Message")).Wait();

    Console.WriteLine($"Message ACKed.");
}
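
Because the Ask above has no timeout, a failed association makes the client wait forever. A sketch of the same client with a bounded wait, assuming a hypothetical `TestMessage` definition and loading the configuration with `ConfigurationFactory.ParseString` instead of the `HoconLoader` helper (whose implementation is not shown in the post):

```csharp
using System;
using System.IO;
using Akka.Actor;
using Akka.Configuration;

// Hypothetical message type; the original post does not show its definition.
public sealed class TestMessage
{
    public TestMessage(string text)
    {
        Text = text;
    }

    public string Text { get; }
}

public static class Program
{
    public static void Main()
    {
        // Assumption: config.hocon sits next to the executable.
        var config = ConfigurationFactory.ParseString(File.ReadAllText("config.hocon"));

        using (var actorSystem = ActorSystem.Create("client", config))
        {
            var testActor = actorSystem.ActorSelection("akka.tcp://host@host:8081/user/TestActor");

            Console.WriteLine("Sending message...");

            try
            {
                // The returned task completes with the reply, or fails once 5 seconds elapse.
                testActor.Ask(new TestMessage("Message"), TimeSpan.FromSeconds(5)).Wait();
                Console.WriteLine("Message ACKed.");
            }
            catch (AggregateException ex)
            {
                // The concrete inner exception type on timeout may differ between Akka.NET versions.
                Console.WriteLine($"Ask failed or timed out: {ex.InnerException?.GetType().Name}");
            }
        }
    }
}
```

A timeout does not fix the underlying connection problem; it only turns the silent hang into an observable failure that can be logged or retried.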

The client and the host are deployed on two Docker containers (docker-compose), whose network configuration is as follows (docker network inspect ...):

[
    {
        "Name": "akkaremotetest_default",
        "Id": "4995d7e340e09e4babcca7dc02ddf4f68f70761746c1246d66eaf7ee40ccec89",
        "Created": "2018-07-21T07:55:39.3534215Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "6040c260c5195d2fe350bf3c89b5f9ede8a65d44da6adb48817fbef266a99e07": {
                "Name": "akkaremotetest_host_1",
                "EndpointID": "a6220a6fee071a29b83e30f9aeb9b9e7ec5008f04f593ff3fb2464477a7e54aa",
                "MacAddress": "02:42:ac:13:00:02",
                "IPv4Address": "172.19.0.2/16",
                "IPv6Address": ""
            },
            "a97078c28c7d221c2c9af948fe36b72590251be69e06d0e66eafd2c74f416037": {
                "Name": "akkaremotetest_client_1",
                "EndpointID": "39bcb8b1047ad666d9c568ee968602b3a93edb4ac2151ba9c3f3c02359ef84f2",
                "MacAddress": "02:42:ac:13:00:03",
                "IPv4Address": "172.19.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

When the containers are started, one of the following happens:

  • the client succeeds with the Ask, the actor writes the received message to the console, and the client confirms success,
  • the client hangs forever, the actor never receives the message, and no timeout occurs.

The problem is that the latter happens most of the time, but only when the host and the client are deployed on Docker containers. When run independently, there are no communication issues.

I think I tried everything without results, and I don't know what else I could do to investigate why the Ask of the client lasts forever, with no errors logged by any of these two actor systems.

Here is the Docker configuration (yml):

version: '2'

services:

  host:
    ports:
      - 8081:8081
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PROJECT_DIR: Host
        PROJECT_NAME: Host
        WAIT_FOR_HOST: 0
    restart: on-failure

  client:
    depends_on:
      - host
    ports:
      - 8082:8082
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PROJECT_DIR: Client
        PROJECT_NAME: Client
        WAIT_FOR_HOST: 1
    restart: on-failure

  tcpdump:
    image: kaazing/tcpdump
    network_mode: "host"
    volumes:
      - ./tcpdump:/tcpdump

Here is the configuration of the client system (config.hocon):

akka {     
    actor {
        provider = remote
    }

    remote {
        dot-netty.tcp {
            enabled-transports = ["akka.remote.netty.tcp"]
            hostname = client
            port = 8082
        }
    }

    stdout-loglevel = DEBUG
    loglevel = DEBUG
    log-config-on-start = on        

    actor {      
        creation-timeout = 20s  
        debug {  
              receive = on 
              autoreceive = on
              lifecycle = on
              event-stream = on
              unhandled = on
              fsm = on
              event-stream = on
              log-sent-messages = on
              log-received-messages = on
              router-misconfiguration = on
        }
    }
}

Here is the configuration of the host system (config.hocon):

akka {     
    actor {
        provider = remote
    }

    remote {
        dot-netty.tcp {
            enabled-transports = ["akka.remote.netty.tcp"]
            hostname = host
            port = 8081
        }
    }

    stdout-loglevel = DEBUG
    loglevel = DEBUG
    log-config-on-start = on        

    actor {        
        creation-timeout = 20s  
        debug {  
              receive = on 
              autoreceive = on
              lifecycle = on
              event-stream = on
              unhandled = on
              fsm = on
              event-stream = on
              log-sent-messages = on
              log-received-messages = on
              router-misconfiguration = on
        }
    }
}

Following the documentation concerning Akka remote configuration, I attempted to change the client configuration like this:

remote {
    dot-netty.tcp {
        enabled-transports = ["akka.remote.netty.tcp"]

        hostname = 172.19.0.3
        port = 8082

        bind-hostname = client
        bind-port = 8082 
    }
}

and the host configuration in a similar way:

remote {
    dot-netty.tcp {
        enabled-transports = ["akka.remote.netty.tcp"]

        hostname = 172.19.0.2
        port = 8081

        bind-hostname = host
        bind-port = 8081 
    }
}

The actor selection was changed slightly as well:

var testActor = actorSystem.ActorSelection("akka.tcp://host@172.19.0.2:8081/user/TestActor");

Unfortunately this has not helped at all (nothing has changed).

In the logs generated during the process, there is one crucial entry produced by the host system. The communication succeeds only when this entry appears, and most often it does not:

[DEBUG][07/21/2018 09:42:50][Thread 0006][remoting] Associated [akka.tcp://host@host:8081] <- akka.tcp://client@client:8082

Any help will be appreciated. Thank you!

-EDIT-

I added the tcpdump section to the yml and opened the generated dump file in Wireshark. I also added a 5-second timeout to the wait on the Ask. It is hard for me to interpret the results, but here is what I got on a failed connection attempt:

172.19.0.3 -> 172.19.0.2: SYN
172.19.0.2 -> 172.19.0.3: SYN, ACK

172.19.0.3 -> 172.19.0.2: ACK

[a 5-second period of silence (waiting till timeout)]

172.19.0.3 -> 172.19.0.2: FIN, ACK

172.19.0.2 -> 172.19.0.3: ACK
172.19.0.2 -> 172.19.0.3: FIN, ACK

172.19.0.3 -> 172.19.0.2: ACK

And here is what happens when the connection succeeds:

172.19.0.3 -> 172.19.0.2: SYN
172.19.0.2 -> 172.19.0.3: SYN, ACK

172.19.0.3 -> 172.19.0.2: ACK
172.19.0.3 -> 172.19.0.2: PSH, ACK

172.19.0.2 -> 172.19.0.3: ACK
172.19.0.2 -> 172.19.0.3: PSH, ACK

172.19.0.3 -> 172.19.0.2: ACK
172.19.0.3 -> 172.19.0.2: PSH, ACK

Versions:

  • Akka.NET 1.3.8
  • .NET Core 2.1.1
  • Docker 18.03.1-ce, build 9ee9f40
  • Docker-compose 1.21.1, build 7641a569

Recommended answer

It turns out that the issue stems from the fact that the projects depend on .NET Core 2.1, which Akka.NET does not support yet, according to this:

We don't officially support .NET Core 2.1 yet. Heck, we aren't even on netstandard 2.0 yet (although work is underway). But thanks for confirming that there are indeed issues :)

After switching to .NET Core 2.0, I can no longer reproduce the described issue.
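
For illustration, switching the target framework could look roughly like the project file below. This is only a sketch: the original project files and Dockerfile are not shown in the post, so the package reference and output type are assumptions, and any Docker base image would have to be changed to a matching 2.0 runtime as well.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Target .NET Core 2.0 instead of 2.1 (netcoreapp2.1). -->
    <TargetFramework>netcoreapp2.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <!-- Assumed dependency; the version is the one listed in the question. -->
    <PackageReference Include="Akka.Remote" Version="1.3.8" />
  </ItemGroup>

</Project>
```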
