Writing from multiple processes launched via xargs to the same fifo pipe causes lines to be lost

Problem description

I have a script where I parallelize job execution while monitoring the progress. I do this using xargs and a named fifo pipe. My problem is that while xargs performs well, some lines written to the pipe are lost. Any idea what the problem is?

For example, the following script (basically my script with dummy data) produces the output below and then hangs at the end, waiting for the missing lines:

$ bash test2.sh 
Progress: 0 of 99
DEBUG: Processed data 0 in separate process
Progress: 1 of 99
DEBUG: Processed data 1 in separate process
Progress: 2 of 99
DEBUG: Processed data 2 in separate process
Progress: 3 of 99
DEBUG: Processed data 3 in separate process
Progress: 4 of 99
DEBUG: Processed data 4 in separate process
Progress: 5 of 99
DEBUG: Processed data 5 in separate process
DEBUG: Processed data 6 in separate process
DEBUG: Processed data 7 in separate process
DEBUG: Processed data 8 in separate process
Progress: 6 of 99
DEBUG: Processed data 9 in separate process
Progress: 7 of 99
##### Script hangs here (could happen after any line) #####

#!/bin/bash
clear

printStateInLoop() {
  local pipe="$1"
  local total="$2"
  local finished=0

  echo "Progress: $finished of $total"
  while true; do
    if [ $finished -ge $total ]; then
      break
    fi

    let finished++
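    # Opens the FIFO, reads one line, and closes it again on every iteration.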
    read line <"$pipe"
    # In final script I would need to do more than just logging
    echo "Progress: $finished of $total"
  done
}

processData() {
  local number=$1
  local pipe=$2

  sleep 1 # Work needs time
  echo "$number" >"$pipe"
  echo "DEBUG: Processed data $number in separate process"
}
export -f processData

process() {
  TMP_DIR=$(mktemp -d)
  PROGRESS_PIPE="$TMP_DIR/progress-pipe"
  mkfifo "$PROGRESS_PIPE"

  DATA_VECTOR=($(seq 0 1 99)) # A bunch of data
  printf '%s\0' "${DATA_VECTOR[@]}" | xargs -0 --max-args=1 --max-procs=5 -I {} bash -c "processData \$@ \"$PROGRESS_PIPE\"" _ {} &

  printStateInLoop "$PROGRESS_PIPE" ${#DATA_VECTOR[@]}
}

process
rm -Rf "$TMP_DIR"

In another post I got the suggestion to switch to while read line; do … done < "$pipe" (function below) instead of while true; do … read line < "$pipe" … done, so that the pipe is not closed after every line read. This reduces the frequency of the problem, but it still happens: some lines are missing, and sometimes I get xargs: bash: terminated by signal 13 (SIGPIPE, sent to a writer when the reader has closed the pipe).

printStateInLoop() {
  local pipe="$1"
  local total="$2"
  local finished=0

  echo "Progress: $finished of $total"
  while [ $finished -lt $total ]; do
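    # The redirection on "done" keeps the FIFO open for the whole inner
    # loop instead of reopening it for every line.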
    while read line; do
      let finished++
      # In final script I would need to do more than just logging
      echo "Progress: $finished of $total"
    done <"$pipe"
  done
}

A lot of people on SO suggested using parallel or pv for this. Sadly, those tools aren't available on the very limited target platform, so my script is based on xargs instead.

Recommended answer

The solution (as pointed out by @markp-fuso and @Dale) was to create a file lock, so that the parallel writers don't race each other on the pipe.

Instead of:

echo "$number" >"$pipe"

I now use flock to create/wait for a lock first:

flock "$pipe.lock" echo "$number" >"$pipe"
