Where to find more explicit errors given container error status codes?


Question

I am running tasks through a Mesos stack, which uses Docker containers.

Sometimes, some of the tasks fail.

Here are some of the related TaskStatus messages and reasons:

message: Container exited with status 1 - reason: REASON_COMMAND_EXECUTOR_FAILED
message: Container exited with status 42 - reason: REASON_COMMAND_EXECUTOR_FAILED
message: Container exited with status 137 - reason: REASON_COMMAND_EXECUTOR_FAILED

Is there a correspondence table that links the container exit status codes in TaskStatus messages to more explicit errors?

Recommended answer

Command tasks can fail for several reasons, and each failure sets a corresponding exit code. For example, Docker 1.10 sets exit status codes like this (from the documentation and this answer):


The exit code from docker run gives information about why the container failed to run or why it exited. When docker run exits with a non-zero code, the exit codes follow the chroot standard, see below:

125 if the error is with the Docker daemon itself:

$ docker run --foo busybox; echo $?
# flag provided but not defined: --foo   See 'docker run --help'.   

126 if the contained command cannot be invoked:

$ docker run busybox /etc; echo $?
# docker: Error response from daemon: Container command '/etc' could not be invoked.   

127 if the contained command cannot be found:

$ docker run busybox foo; echo $?
# docker: Error response from daemon: Container command 'foo' not found or does not exist.
# 127

Exit code of the contained command otherwise:

$ docker run busybox /bin/sh -c 'exit 3'; echo $?
# 3


Another set of exit code conventions can be found here:

| Code  |            Meaning             |         Example         |                                                   Comments                                                   |
|-------|--------------------------------|-------------------------|--------------------------------------------------------------------------------------------------------------|
| 1     | Catchall for general errors    | let "var1 = 1/0"        | Miscellaneous errors, such as "divide by zero" and other impermissible operations                            |
| 2     | Misuse of shell builtins       | empty_function() {}     | Missing keyword or command, or permission problem (and diff return code on a failed binary file comparison). |
| 126   | Command invoked cannot execute | /dev/null               | Permission problem or command is not an executable                                                           |
| 127   | "command not found"            | illegal_command         | Possible problem with $PATH or a typo                                                                        |
| 128   | Invalid argument to exit       | exit 3.14159            | exit takes only integer args in the range 0 - 255 (see first footnote)                                       |
| 128+n | Fatal error signal "n"         | kill -9 $PPID of script | $? returns 137 (128 + 9)                                                                                     |
| 130   | Script terminated by Control-C | Ctl-C                   | Control-C is fatal error signal 2, (130 = 128 + 2, see above)                                                |
| 255*  | Exit status out of range       | exit -1                 | exit takes only integer args in the range 0 - 255                                                            |
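The conventions in the table above can be folded into a small lookup helper. The following is only a sketch: `explain_exit_status` is a hypothetical function name, the wording of each explanation is my own paraphrase of the quoted rules, and the 125/126/127 meanings assume the container was started through `docker run` as described earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos or Docker): map an exit status
# to a short explanation based on the conventions quoted above.
explain_exit_status() {
  code="$1"
  case "$code" in
    ''|*[!0-9]*) echo "not a numeric exit status"; return 1 ;;
    0)   echo "success" ;;
    1)   echo "general error reported by the command" ;;
    125) echo "docker run itself failed" ;;
    126) echo "contained command could not be invoked" ;;
    127) echo "contained command not found" ;;
    *)
      if [ "$code" -gt 128 ] && [ "$code" -lt 255 ]; then
        # 128+n convention: n is the number of the fatal signal.
        sig=$((code - 128))
        # kill -l <number> prints the signal name; fall back if it is unknown.
        name=$(kill -l "$sig" 2>/dev/null) || name="unknown"
        echo "terminated by signal $sig ($name)"
      else
        echo "application-specific exit code $code"
      fi
      ;;
  esac
}

# The three statuses from the question:
explain_exit_status 1
explain_exit_status 42
explain_exit_status 137
```

Anything that falls through to the last branch, such as 42, has no reserved meaning, so only the command's own documentation or logs can explain it.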

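In your examples:

  • 137 – The process was killed by SIGKILL (128 + 9 = 137, where 9 is the signal number of SIGKILL). This usually indicates an Out Of Memory kill.
  • 1 – The command exited with status 1, probably due to invalid configuration, an internal application error or invalid input.
  • 42 – The Answer to the Ultimate Question of Life, the Universe, and Everything. More seriously, 42 has no reserved meaning, so it is an exit code chosen by the command itself.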
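As a worked example of the 128 + n arithmetic, the sketch below pulls the status out of a message shaped like the ones in the question and converts it to a signal name. The variable names are illustrative, and the parsing assumes the status is the last word of the message.

```shell
#!/bin/sh
# Message format copied from the question; variable names are illustrative.
message='Container exited with status 137'

# Remove everything up to and including the last space, leaving the number.
status="${message##* }"

# For statuses above 128, the signal number is the status minus 128.
if [ "$status" -gt 128 ]; then
  signal=$((status - 128))
  echo "status $status = 128 + $signal, signal name: $(kill -l "$signal")"
fi
```

A SIGKILL on its own does not prove an out-of-memory condition, since any process with sufficient privileges can send it, so the result is best treated as a hint to confirm against the TaskStatus message or the logs.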


If you need more information to explain a status code, check the message field in the Mesos TaskStatus update; for example, Mesos puts information about OOM there. The same information can also be found in the Mesos logs. To debug why a command returned a non-zero code, check the files stored in the executor sandbox, especially stderr/stdout or command-specific logs.
