java.io.IOException: error=11

Question

I am experiencing a weird problem with the Java ProcessBuilder. The code is shown below (in a slightly simplified form)

import java.io.IOException;

public class Whatever implements Runnable
{
    // log and someIdentifier are left out of this simplified listing

    public void run() {
        // someIdentifier is a randomly generated string
        String in = someIdentifier + "input.txt";
        String out = someIdentifier + "output.txt";
        ProcessBuilder builder = new ProcessBuilder("./whatever.sh", in, out);
        try {
            Process process = builder.start();
            process.waitFor();
        } catch (IOException e) {
            log.error("Could not launch process. Command: " + builder.command(), e);
        } catch (InterruptedException ex) {
            log.error(ex);
        }
    }
}

whatever.sh reads:

R --slave --args $1 $2 <whatever1.R >> r.log    

Many instances of Whatever are submitted to a fixed-size ExecutorService (35 threads). The rest of the application waits for all of them to finish, using a CountDownLatch. Everything runs fine for several hours (Scientific Linux 5.0, java version "1.6.0_24") before the following exception is thrown:

java.io.IOException: Cannot run program "./whatever.sh": java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.ProcessBuilder.start(Unknown Source)
... rest of stack trace omitted...

Does anyone have an idea what this means? Judging by the Google/Bing search results for java.io.IOException: error=11, it is not the most common of exceptions, and I am completely baffled.

My wild and not so educated guess is that I have too many threads trying to launch the same file at the same time. However, it takes hours of CPU time to reproduce the problem, so I have not tried with a smaller number.

Any suggestions are greatly appreciated.

Recommended answer

The error=11 is almost certainly the EAGAIN error code:

$ grep EAGAIN asm-generic/errno-base.h 
#define EAGAIN      11  /* Try again */
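
Since the number in the message is the only clue Java gives you, it can help to flag this specific failure in the catch block so it stands out in the logs. The following is a minimal sketch, not part of the original code: the class and method names are hypothetical, and it assumes the "error=11" text from the stack trace above, which is JVM- and platform-specific.

```java
import java.io.IOException;
import java.util.regex.Pattern;

// Hypothetical helper: recognizes the EAGAIN failure by its message text.
final class SpawnErrors {
    // Matches "error=11" but not "error=110", "error=111", and so on.
    private static final Pattern EAGAIN_TEXT = Pattern.compile("error=11(?!\\d)");

    private SpawnErrors() {}

    // Walks the cause chain and reports whether any message mentions error=11.
    // Assumption: the message format matches the stack trace in the question.
    static boolean looksLikeEagain(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && EAGAIN_TEXT.matcher(msg).find()) {
                return true;
            }
        }
        return false;
    }
}
```

In the catch (IOException e) block of run(), a call such as SpawnErrors.looksLikeEagain(e) could decide whether to log a hint about process limits alongside the stack trace.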

The clone(2) system call documents an EAGAIN error return:

   EAGAIN Too many processes are already running.

The fork(2) system call documents two EAGAIN error returns:

   EAGAIN fork() cannot allocate sufficient memory to copy the
          parent's page tables and allocate a task structure for
          the child.

   EAGAIN It was not possible to create a new process because
          the caller's RLIMIT_NPROC resource limit was
          encountered.  To exceed this limit, the process must
          have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE
          capability.

If you were really that low on memory, it would almost certainly show in the system logs. Check dmesg(1) output or /var/log/syslog for any potential messages about low system memory. (Other things would break. This doesn't seem too plausible.)

Much more likely, you are running into either the per-user limit on processes or the system-wide maximum number of processes. Perhaps one of your processes isn't properly reaping zombies? This would be very easy to spot by checking ps(1) output over time:

while true ; do ps auxw >> ~/processes ; sleep 10 ; done

(Maybe check every minute or ten minutes if it really does take hours before you're in trouble.)
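
Reading those snapshots by eye gets tedious, so here is a rough sketch of counting zombies per user in a single `ps auxw` snapshot. The class and method names are hypothetical, and the code assumes the usual column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), with a STAT value beginning with Z marking a zombie.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for inspecting one snapshot of `ps auxw` output.
final class PsSnapshot {
    private PsSnapshot() {}

    // Returns user -> number of zombie processes found in the snapshot text.
    // Assumes STAT is the 8th whitespace-separated column.
    static Map<String, Integer> zombiesPerUser(String psOutput) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String line : psOutput.split("\\r?\\n")) {
            // Limit of 11 keeps the whole command line in the last column.
            String[] cols = line.trim().split("\\s+", 11);
            if (cols.length < 11 || cols[0].equals("USER")) {
                continue; // skip the header and malformed lines
            }
            if (cols[7].startsWith("Z")) {
                Integer old = counts.get(cols[0]);
                counts.put(cols[0], old == null ? 1 : old + 1);
            }
        }
        return counts;
    }
}
```

If the count for the user running the JVM keeps climbing from one snapshot to the next, unreaped children are the likely culprit.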

If you're not reaping zombies, then read up on whatever you must do to ProcessBuilder to use waitpid(2) to reap your dead children.
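
As a starting point, the sketch below shows the usual housekeeping around a child started with ProcessBuilder: drain its output so it cannot block on a full pipe, wait for it to exit, and close the three pipe streams in a finally block. The class and method names are hypothetical, and whether this cures the leak in the question is an assumption that the ps snapshots would have to confirm.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: runs one child process and cleans up after it.
final class ChildRunner {
    private ChildRunner() {}

    // Starts the child, discards its output, and returns its exit code.
    static int runAndReap(ProcessBuilder builder) throws IOException, InterruptedException {
        builder.redirectErrorStream(true); // merge stderr into stdout: one stream to drain
        Process process = builder.start();
        try {
            // The child gets no input from us, so close its stdin right away.
            process.getOutputStream().close();
            // Drain the output so the child never stalls on a full pipe buffer.
            InputStream out = process.getInputStream();
            byte[] buf = new byte[4096];
            while (out.read(buf) != -1) {
                // output is discarded in this sketch
            }
            return process.waitFor(); // blocks until the child has exited
        } finally {
            // Release the pipe file descriptors even if we were interrupted.
            closeQuietly(process.getInputStream());
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getOutputStream());
            process.destroy(); // stops the child if it is still running
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // nothing useful to do while cleaning up
        }
    }
}
```

In run(), the builder.start() and process.waitFor() pair could be replaced by a single call to ChildRunner.runAndReap(builder).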

If you're legitimately running more processes than your rlimits allow, you'll need to either raise the limit with ulimit in your bash(1) scripts (if running as root) or set a higher limit for the nproc item in /etc/security/limits.conf.

If you are instead running into the system-wide process limits, you might need to write a larger value into /proc/sys/kernel/pid_max. See proc(5) for some (short) details.
