Tee with process substitution misunderstanding

Question

I am trying to write a pretty printer for LDAP entries which only fetches the root LDAP record once and then pipes the output into tee that invokes the pretty printer for each section.

For illustration's sake, say my group_entry function returns the LDIF of a specific LDAP DN. The details aren't important, so let's say it always returns:

dn: cn=foo,dc=example,dc=com
cn: foo
owner: uid=foo,dc=example,dc=com
owner: uid=bar,dc=example,dc=com
member: uid=foo,dc=example,dc=com
member: uid=baz,dc=example,dc=com
member: uid=quux,dc=example,dc=com
custom: abc123

I can easily extract the owners and members separately with a bit of grep'ing and cut'ing. I can then pipe those secondary DNs into another LDAP search query to get their real names. For the sake of example, let's say I have a pretty_print function, parametrised on the LDAP attribute name, which does everything I just mentioned and then formats the result nicely with AWK:

$ group_entry | pretty_print owner
Owners:
foo    Mr Foo
bar    Dr Bar

$ group_entry | pretty_print member
Members:
foo    Mr Foo
baz    Bazzy McBazFace
quux   The Artist Formerly Known as Quux

These work fine individually, but when I try to tee them together, the pipeline prints the first heading and then hangs:

$ group_entry | tee >(pretty_print owner) | pretty_print member
Members:
[Sits there waiting for Ctrl+C]

Obviously I have some misunderstanding about how this is supposed to work, but it escapes me. What am I doing wrong?


EDIT: For the sake of completeness, here's my full script:

#!/usr/bin/env bash

set -eu -o pipefail

LDAPSEARCH="ldapsearch -xLLL"

group_entry() {
  local group="$1"
  ${LDAPSEARCH} "(&(objectClass=posixGroup)(cn=${group}))"
}

get_attribute() {
  local attr="$1"
  grep "${attr}:" | cut -d" " -f2
}

get_names() {
  # We strip blank lines out of the LDIF entry, then we always have "dn"
  # followed by "cn" records; we strip off the attribute name and
  # concatenate those lines, then sort. So we get a sorted list of:
  # {{distinguished_name}} {{real_name}}
  xargs -n1 -J% ${LDAPSEARCH} -s base -b % cn \
  | grep -v "^$" \
  | cut -d" " -f2- \
  | paste - - \
  | sort
}

pretty_print() {
  local attr="$1"
  local -A pretty=([member]="Members" [owner]="Owners")

  get_attribute "${attr}" \
  | get_names \
  | gawk -F'\t' -v title="${pretty[${attr}]}:" '
    BEGIN { print title }
    { print "-", gensub(/^uid=([^,]+),.*$/, "\\1", "g", $1), "\t", $2 }
  '
}

# FIXME I don't know why tee with process substitution doesn't work here
group_entry "$1" | pretty_print owner
group_entry "$1" | pretty_print member

Solution

The behavior you describe looks very much like a situation that can arise in a C program that forks and execs another program (as the shell and xargs both certainly do) without properly handling all the open file descriptors. You can be left in a situation where a process p1 does not terminate because it's waiting to observe EOF on its standard input, but it never will, because another process p2 holds an open file descriptor for the write end of the pipe that provides p1's standard input, and p2 is itself waiting for p1 to terminate or perform some other action.
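The effect of a lingering write end can be seen with a minimal sketch. The commands here are stand-ins and not part of the original script: `sleep` plays the role of p2 and `cat` the role of p1. The timings are measured in whole seconds, so treat the numbers as approximate.

```bash
#!/usr/bin/env bash
# Stand-in demo (assumption: sleep ~ p2, cat ~ p1 waiting for EOF on stdin).

# Case 1: the background sleep inherits the pipe's write end as its stdout,
# so cat cannot see EOF until sleep exits, although echo finished at once.
start=$(date +%s)
{ sleep 2 & echo "data"; } | cat
slow=$(( $(date +%s) - start ))

# Case 2: sleep's stdout is redirected away from the pipe, so it no longer
# holds the write end and cat sees EOF as soon as echo is done.
start=$(date +%s)
{ sleep 2 >/dev/null & echo "data"; } | cat
fast=$(( $(date +%s) - start ))

echo "inherited write end: ${slow}s, redirected: ${fast}s"
```

If p2 never exited, as in a mutual wait, the first case would block indefinitely instead of for two seconds.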

Nevertheless, I don't see anything inherently wrong with your pipeline in that regard, and I do not reproduce the hang with this simpler model ...

echo "foo" | tee >(cat) | cat

... in version 4.2.46 of bash. It may be that there is nevertheless a related bug in your version of bash (even if it's the same one) or in xargs, but that's speculative. I do not think that your pipeline should hang as you say it does, but I'm not prepared to start pointing fingers.
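The simpler model can be extended slightly to show what each stage reads. This is only a sketch: `printf` stands in for group_entry, and the two `sed` commands stand in for the pretty_print calls and merely label their input. The final `sort` is there only to make the order reproducible.

```bash
#!/usr/bin/env bash
# Stand-ins (assumptions): printf ~ group_entry, sed ~ pretty_print.
# The process substitution inherits tee's stdout, which is the pipe
# feeding the last stage, so its output ends up in that stage's input.
out=$(
  printf 'record\n' \
    | tee >(sed 's/^/owner stage: /') \
    | sed 's/^/member stage read: /' \
    | LC_ALL=C sort
)
printf '%s\n' "$out"
```

The last stage reports two lines: the raw record and the line already processed by the first stand-in.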

In any event, even if your pipeline did not hang, it does not have the semantics you want, as @chepner pointed out in comments. The pretty_print member will receive the output of tee on its standard input, and that will include both the output of group_entry and the output of pretty_print owner. You could consider implementing it differently: since tee can multiplex input more than two ways, you may kill two birds with one stone by doing this:

group_entry "$1" | tee >(pretty_print owner) >(pretty_print member)

But that leaves open the possibility that the output of the two pretty_print executions will be intermingled, and it also echoes the group_entry output. You could conceivably filter out the group_entry output, but to avoid the intermingling, you need to ensure that the two pretty_print commands run sequentially. That presents a problem for a tee-based approach, because if any of tee's outputs blocks then the whole pipeline can stall.
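A rough sketch of the filtering idea follows. The helper names `entry` and `show` are hypothetical stand-ins for group_entry and pretty_print. Redirecting tee's own stdout to /dev/null drops the echoed record, but the two process substitutions still run concurrently, so the result is sorted here purely to make it reproducible.

```bash
#!/usr/bin/env bash
# Stand-ins (assumptions): entry ~ group_entry, show ~ pretty_print.
entry() { printf 'owner: foo\nmember: baz\n'; }
show()  { sed -n "s/^$1: /$1 -> /p"; }

# >/dev/null discards tee's own copy of the record. The process
# substitutions were started with the pipeline's stdout, so their output
# still reaches the next stage, in no guaranteed order.
out=$(entry | tee >(show owner) >(show member) >/dev/null | LC_ALL=C sort)
printf '%s\n' "$out"
```

Without the `sort`, either line may come first from one run to the next, which is the intermingling problem described above.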

One solution would be to redirect the output of one or both pretty_print commands to a file. Alternatively, if it is essential that both outputs go to stdout, then I see no good alternative but to capture the group_entry output, and feed it separately to each pretty_print job. You could capture it to a file, but that's unnecessary, and a bit messy. Consider this instead:

entry_lines=$(group_entry "$1")
pretty_print owner  <<<"$entry_lines"
pretty_print member <<<"$entry_lines"

That uses command substitution to capture the output of group_entry in a shell variable (embedded newlines are preserved; trailing newlines are stripped, but the here string appends one back), and uses a here string to replay it into each pretty_print process.
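For a version that can be tried without an LDAP server, here is a minimal sketch of the same data flow. Both functions are simplified stand-ins, not the originals: this group_entry prints fixed LDIF lines, and this pretty_print only extracts the uid instead of looking up real names.

```bash
#!/usr/bin/env bash
# Stand-ins (assumptions): the real group_entry queries LDAP and the real
# pretty_print resolves names; these only mimic the data flow.
group_entry() {
  printf '%s\n' 'owner: uid=foo,dc=example,dc=com' \
                'member: uid=baz,dc=example,dc=com'
}
pretty_print() {
  printf '%s:\n' "$1"
  sed -n "s/^$1: uid=\([^,]*\),.*/- \1/p"
}

# Fetch the record once, then replay it to each formatter in turn.
entry_lines=$(group_entry)
owners=$(pretty_print owner  <<<"$entry_lines")
members=$(pretty_print member <<<"$entry_lines")
printf '%s\n%s\n' "$owners" "$members"
```

Because the second pretty_print starts only after the first has finished, the two sections cannot interleave.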
