Erlang: Distributed Application Strange Behaviour


Problem Description



      I'm playing with distributed Erlang applications.

      Configuration and ideas are taken from
      http://www.erlang.org/doc/pdf/otp-system-documentation.pdf, section 9.9, "Distributed Applications".

      • We have 3 nodes: n1@a2-X201, n2@a2-X201, n3@a2-X201
      • We have an application wd that does some useful work :)

      Configuration files:

      • wd1.config - for the first node:

          [{kernel,
            %% wd runs on 'n1@a2-X201' first. If n1 goes down, the
            %% remaining nodes wait 5000 ms and then restart wd on n2
            %% or n3 (nodes grouped in a tuple have equal priority).
            [{distributed, [{wd, 5000, ['n1@a2-X201', {'n2@a2-X201', 'n3@a2-X201'}]}]},
             {sync_nodes_mandatory, ['n2@a2-X201', 'n3@a2-X201']},
             {sync_nodes_timeout, 5000}]},
           {sasl,
            %% All reports go to this file
            [{sasl_error_logger, {file, "/tmp/wd_n1.log"}}]}].
      

      • wd2.config for the second node:

          [{kernel,
            [{distributed, [{wd, 5000, ['n1@a2-X201', {'n2@a2-X201', 'n3@a2-X201'}]}]},
             {sync_nodes_mandatory, ['n1@a2-X201', 'n3@a2-X201']},
             {sync_nodes_timeout, 5000}]},
           {sasl,
            %% All reports go to this file
            [{sasl_error_logger, {file, "/tmp/wd_n2.log"}}]}].
      
      

      • The config for node n3 looks similar; a reconstructed sketch follows below.
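
      Since the post only says the n3 config "looks similar", here is an assumed reconstruction following the same pattern as wd1.config and wd2.config (only sync_nodes_mandatory and the log file name change):

          [{kernel,
            [{distributed, [{wd, 5000, ['n1@a2-X201', {'n2@a2-X201', 'n3@a2-X201'}]}]},
             {sync_nodes_mandatory, ['n1@a2-X201', 'n2@a2-X201']},
             {sync_nodes_timeout, 5000}]},
           {sasl,
            %% All reports go to this file (assumed name)
            [{sasl_error_logger, {file, "/tmp/wd_n3.log"}}]}].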

      Now start Erlang in 3 separate terminals:

      • erl -sname n1@a2-X201 -config wd1 -pa $WD_EBIN_PATH -boot start_sasl
      • erl -sname n2@a2-X201 -config wd2 -pa $WD_EBIN_PATH -boot start_sasl
      • erl -sname n3@a2-X201 -config wd3 -pa $WD_EBIN_PATH -boot start_sasl

      Start the application on each of the Erlang nodes with application:start(wd):

      (n1@a2-X201)1> application:start(wd).
      
      =INFO REPORT==== 19-Jun-2011::15:42:51 ===
      wd_plug_server starting... PluginId: 4 Path: "/home/a2/src/erl/data/SIG" FileMask: "(?i)(.*)\\.SIG$" 
      ok
      

      (n2@a2-X201)1> application:start(wd).
      ok
      (n2@a2-X201)2> 
      

      (n3@a2-X201)1> application:start(wd).
      ok
      (n3@a2-X201)2> 
      

      At the moment everything is OK. As described in the Erlang documentation, the application is running at node n1@a2-X201.
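
      A quick way to confirm where wd actually runs (not shown in the original post, but standard OTP) is application:which_applications/0. Schematically, with output abbreviated:

          %% On n1, wd shows up among the locally running applications:
          > application:which_applications().
          [{wd,...},{sasl,...},{stdlib,...},{kernel,...}]

          %% On n2 and n3 the start call returned ok, but the distributed
          %% application controller runs wd on n1, so wd is not listed there:
          > application:which_applications().
          [{sasl,...},{stdlib,...},{kernel,...}]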

      Now kill node n1: after the 5000 ms failover timeout from the distributed spec, the application is migrated to n2.

      (n2@a2-X201)2> 
      =INFO REPORT==== 19-Jun-2011::15:46:28 ===
      wd_plug_server starting... PluginId: 4 Path: "/home/a2/src/erl/data/SIG" FileMask: "(?i)(.*)\\.SIG$" 
      
      

      Continue our game: kill node n2. One more time the system works fine. We now have our application at node n3:

      (n3@a2-X201)2> 
      =INFO REPORT==== 19-Jun-2011::15:48:18 ===
      wd_plug_server starting... PluginId: 4 Path: "/home/a2/src/erl/data/SIG" FileMask: "(?i)(.*)\\.SIG$" 
      

      Now restore nodes n1 and n2:

      Erlang R14B (erts-5.8.1) [source] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]
      
      Eshell V5.8.1  (abort with ^G)
      (n1@a2-X201)1> 
      
      Eshell V5.8.1  (abort with ^G)
      (n2@a2-X201)1> 
      

      Nodes n1 and n2 are back.
      Looks like now I have to restart the application manually. Let's do it at node n2 first:

      (n2@a2-X201)1> application:start(wd).
      

      • Looks like it hung ...
      • Now restart it at n1:

      (n1@a2-X201)1> application:start(wd).
      
      =INFO REPORT==== 19-Jun-2011::15:55:43 ===
      wd_plug_server starting... PluginId: 4 Path: "/home/a2/src/erl/data/SIG" FileMask: "(?i)(.*)\\.SIG$" 
      
      ok
      (n1@a2-X201)2> 
      

      It works. And node n2 has also returned ok:

      Eshell V5.8.1  (abort with ^G)
      (n2@a2-X201)1> application:start(wd).
      ok
      (n2@a2-X201)2> 
      

      At node n3 we see:

      =INFO REPORT==== 19-Jun-2011::15:55:43 ===
          application: wd
          exited: stopped
          type: temporary
      

      In general, everything looks OK, as described in the documentation: restarting wd on n1 triggered a takeover, which is why n3 reports that the application was stopped there. The only surprise is the delay in starting the application at node n2.

      Now kill node n1 once more:

      (n1@a2-X201)2> 
      User switch command
       --> q
      [a2@a2-X201 releases]$ 
      

      Oops ... everything hangs. The application was not restarted at another node.

      Actually, while I was writing this post I realized that sometimes everything is OK and sometimes I have a problem.

      Any ideas why there could be problems when restoring the "primary" node and killing it one more time?

      Solution

      As explained over at Learn You Some Erlang (scroll to the bottom), distributed applications only work well when started as part of a release, not when you start them manually with application:start.
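
      For reference, "as part of a release" means building a boot script and booting each node straight into it, instead of calling application:start from the shell. Below is a minimal sketch of that workflow; the release name and every version number are assumptions that must match the local installation, not values from the original post. First, a wd.rel file next to the ebin directories:

          %% wd.rel -- all versions are placeholders; check your install
          {release, {"wd", "1"}, {erts, "5.8.1"},
           [{kernel, "2.14.1"},
            {stdlib, "1.17.1"},
            {sasl, "2.1.9.2"},
            {wd, "1.0"}]}.

      Then generate wd.boot with systools and boot every node from it, so the distributed application controller owns wd from the start:

          %% in an Erlang shell, with wd.rel in the current directory:
          1> systools:make_script("wd", [local]).
          ok

          $ erl -sname n1@a2-X201 -config wd1 -pa $WD_EBIN_PATH -boot wd
          $ erl -sname n2@a2-X201 -config wd2 -pa $WD_EBIN_PATH -boot wd
          $ erl -sname n3@a2-X201 -config wd3 -pa $WD_EBIN_PATH -boot wd

      Booted this way, start, failover, and takeover are coordinated by the release boot sequence on all three nodes, avoiding the hand-ordered application:start calls that hung above.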
