Does supervisor block calls while restarting children?


Problem description


I'm trying to understand what's happening here:

I have a supervisor that is cyclically restarting one client without triggering the MaxR, MaxT mechanism. The client just crashes slowly enough never to trigger the rate limitation.
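To make "slowly enough" concrete, here is a minimal sketch of how the restart limits are declared. The module name `gateway_sup_sketch` and the values 3 and 10 are assumptions for illustration; the real MaxR and MaxT of `gateway_sup` are not given in the question.

```erlang
-module(gateway_sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, gateway_sup_sketch}, ?MODULE, []).

init([]) ->
    %% Hypothetical limits (MaxR = 3, MaxT = 10): the supervisor gives up
    %% only after MORE than 3 restarts within any 10 second window.
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
    %% No static children; in this sketch they would be added later
    %% with supervisor:start_child/2.
    {ok, {SupFlags, []}}.
```

With these assumed numbers, a child whose every start attempt takes about 5 seconds to fail (the default `gen_server:call/2` timeout) produces at most three restarts in any 10 second window, so the limit is never exceeded.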

There is meant to be another mechanism that uses supervisor:which_children/1, delete_child/2 and start_child/2 to adapt the set of children to reality (it scans for USB devices and tries to keep one supervisor child per device found).

This would normally behave like a safety net to the rate limitation, but strangely it looks like the mechanism that deletes and starts children is not called at all.

To find out what's going on I called supervisor:which_children/1 from the shell and it looks like the call just blocks and never returns.

Can it be that calls to the supervisor are blocked while it is busy trying to restart a child?
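One way to check this without hanging the shell: `supervisor:which_children/1` takes no timeout argument, so the sketch below runs it in a helper process and gives up after a chosen time. It also reads the supervisor's state from the outside with `erlang:process_info/2`, which does not need the supervisor to answer. The module name `sup_probe` and both function names are assumptions made for this example.

```erlang
-module(sup_probe).
-export([which_children/2, inspect/1]).

%% Call supervisor:which_children/1 from a helper process so the caller
%% can stop waiting after Timeout milliseconds.
which_children(Sup, Timeout) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() ->
        Parent ! {Ref, supervisor:which_children(Sup)}
    end),
    receive
        {Ref, Children} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Children};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% The helper died, e.g. because the supervisor does not exist.
            {error, Reason}
    after Timeout ->
        exit(Pid, kill),
        erlang:demonitor(MRef, [flush]),
        %% Drop a reply that may have arrived just after the timeout.
        receive {Ref, _} -> ok after 0 -> ok end,
        {error, timeout}
    end.

%% Look at a (possibly stuck) local process without sending it a message.
inspect(Name) when is_atom(Name) ->
    case whereis(Name) of
        undefined -> undefined;
        Pid -> inspect(Pid)
    end;
inspect(Pid) when is_pid(Pid) ->
    erlang:process_info(Pid, [current_function, message_queue_len]).
```

For example, `sup_probe:which_children(gateway_sup, 2000)` returns `{error, timeout}` if the supervisor does not answer within two seconds, and `sup_probe:inspect(gateway_sup)` shows which function it is currently executing and how many messages are queued.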

Addendum:

It looks like the crash happens during child start:

=SUPERVISOR REPORT==== 29-Mar-2011::21:36:20 ===
     Supervisor: {local,gateway_sup}
     Context:    start_error
     Reason:     {'EXIT',{timeout,{gen_server,call,[<0.155.0>,late_init]}}}
     Offender:   [{pid,<0.76.0>},
              {name,gw_3_5},
              {mfa,{channel,start_link,
                            [[{gateways,[{left,108},{right,103}]}],
                             {3,5}]}},
              {restart_type,transient},
              {shutdown,10000},
              {child_type,worker}]

Solution

Setting the discussion aside, the answer to the question is:

When restarting a child that fails during startup, the supervisor loops inside its own process (it is a gen_server internally) and does not handle any API calls made to it in the meantime.

This is especially bad if the supervisor's rate limitation is configured so that it does not trigger on startup errors of the children. In my example, startup is slow (especially when it fails).

So if the supervisor loops forever trying to restart a child, it is unreachable for any calls to it, which is usually bad.
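One possible way to keep the supervisor responsive is to make the child's start function return quickly and move the slow initialisation into the child's own process. The sketch below is not the original `channel` module: the name `channel_sketch`, the state layout and `open_device/1` are hypothetical, and `handle_continue/2` assumes OTP 21 or later.

```erlang
-module(channel_sketch).
-behaviour(gen_server).
-export([start_link/2]).
-export([init/1, handle_continue/2, handle_call/3, handle_cast/2,
         handle_info/2]).

start_link(Opts, Id) ->
    gen_server:start_link(?MODULE, {Opts, Id}, []).

%% Return immediately so the supervisor is not held up in start_link;
%% the slow part runs in handle_continue/2 before any other message.
init({Opts, Id}) ->
    State = #{opts => Opts, id => Id, device => undefined},
    {ok, State, {continue, late_init}}.

handle_continue(late_init, State = #{id := Id}) ->
    %% A failure here crashes an already started child, so it is
    %% handled as an ordinary restart rather than a start error.
    Device = open_device(Id),
    {noreply, State#{device := Device}}.

handle_call(_Request, _From, State) ->
    {reply, {error, not_implemented}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

%% Hypothetical placeholder for the slow, failure-prone device setup.
open_device(Id) ->
    {device, Id}.
```

The trade-off is that `start_link` now succeeds before the device is known to work, so anything that depends on a fully initialised child has to cope with a child that crashes shortly after starting.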
