sed backreferences and command interpolation

Question

I am having an interesting issue using only sed to substitute short month strings (e.g. "Oct") with the corresponding number value (e.g. "10"), given a string such as the following:

Oct 14 09:23:35 some other input

to be replaced directly via sed with:

14-10-2013 09:23:35 some other input

None of the following is actually relevant to solving the trivial problem of month string -> number conversion; I'm more interested in understanding some weird behavior I encountered while trying to solve this problem entirely with sed.

Without any attempt of this string substitution (the echo statement is in lieu of the actual input in my script):

    ...
    MMM_DD_HH_mm_SS="([A-Za-z]{3}) ([0-9]{2}) (.+:[0-9]{2})"
    echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_SS (.+)/\2-\1-\3 \4/"

Then how to transform the backreference \1 into a number. Of course one thinks of using command interpolation with the backreference as an argument:

...
TestFunc()
{
    echo "received input $1$1"
}
...
echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_SS (.+)/\2-$(TestFunc \\1)-\3 \4/"

Where TestFunc would be a variation of the date command (as proposed by Jotne below) with the echo'd date-time group as an input. Here TestFunc is just an echo because I'm much more interested in the behavior of what the function believes to be the value of $1.

In this case the sed with TestFunc produces the output:

14-received input OctOct-09:23:35 some other input

Which suggests that sed actually is inserting backreference \1 into the command substitution $(...) for handling by TestFunc (which appears to receive \1 as the local variable $1).

However, all attempts to do anything more with the local $1 fail. For example:

TestFunc()
{
    echo "processed: $1$1" > tmp.txt # Echo 1

    if [ "$1" == "Oct" ]; then
       echo "processed: 10"
    else
       echo "processed: $1$1"        # Echo 2
    fi
}

returns:

14-processed: OctOct-09:23:35 some other input

$1 has been substituted into Echo 2, yet tmp.txt contains the value processed: \1\1; as if the backreference is not being inserted into the command substitution. Even weirder, the if condition fails with $1 != "Oct", yet it falls through to an echo statement which indicates $1 = "Oct".

My question is why is the backreference insertion working in the case of Echo 2 but not Echo 1? I suspect that the backreference insertion isn't working at all (given the failure of the if statement in TestFunc) but rather something subtle is going on that makes the substitution appear to work correctly in the case of Echo 2; what is that subtlety?

On further reflection I believe I understand what is going on:

  • \\1 is passed to the command substitution subroutine / child function as the literal \1. This is why the equality test within the child function fails.

  • However, echo simply prints the string it receives, so echo "aa$1aa" returns aa\1aa to sed as the result of the command substitution. Other commands such as rev also "see" $1 as \1.

  • sed then interpolates \1 in aa\1aa as Oct, or whatever the backreference is, and returns aaOctaa to the user.
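
The order of evaluation described in the list above can be checked directly. The following is a minimal sketch, assuming GNU sed; the script variable is introduced here only for illustration, and printf stands in for echo so that the backslash is printed literally in any shell. Building the sed script in a variable first shows that the shell has already run TestFunc, with the literal \1 as its argument, before sed starts.

```shell
TestFunc() {
    # $1 is the two-character string \1 here, never "Oct" or "a"
    printf 'received input %s%s\n' "$1" "$1"
}

# The shell expands $(...) while building the string; sed is not involved yet.
script="s/(.) (.) (.)/$(TestFunc \\1)/"

# Show the script exactly as sed will receive it:
printf '%s\n' "$script"
# prints: s/(.) (.) (.)/received input \1\1/

# Only now does sed replace \1 with the first captured group:
echo "a b c" | sed -r "$script"
# prints: received input aa
```

This would also account for the Echo 1 / Echo 2 difference: anything the function writes to a file, or tests in an if, sees the literal \1, while anything it writes to standard output ends up in the replacement text, where sed later expands it.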

Since command substitution within regexes clearly works, it would be really cool if sed replaced the value of \\1 (or \1, whatever) with the backreference before executing the command substitution $(...); this would significantly increase sed's power...

Recommended answer

This might work for you (GNU sed):

sed -r 's/$/\nJan01...Oct10Nov11Dec12/;s/(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/;s/\n.*//' file

Add a lookup to the end of the line and use the back reference to match on it making sure to remove the lookup table in all cases.
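
To try the lookup-table approach, the table has to be written out in full. The sketch below assumes GNU sed (for -r and for \n in the replacement text) and fills in all twelve months, which is an assumption about what the elided part of the table stands for. month_to_number is a hypothetical wrapper name, and the year 2013 is hard-coded as in the command above.

```shell
month_to_number() {
    # 1) append a newline plus the month lookup table to the line
    # 2) match the month name again inside the table via \1 and
    #    capture the two digits that follow it as \4
    # 3) remove the table if step 2 did not match (unknown month)
    sed -r 's/$/\nJan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12/
            s/(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/
            s/\n.*//'
}

echo "Oct 14 09:23:35 some other input" | month_to_number
# prints: 14-10-2013 09:23:35 some other input
```

A line whose first three letters are not in the table passes through unchanged, because the third command strips the appended table again.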

Here's an example of passing a backreference to a function:

f(){ echo "x$1y$1z"; }
echo a b c | sed -r  's/(.) (.) (.)/'"$(f \\2)"'/'

returns:

xbybz

HTH
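
If the goal is for a command to see the matched text rather than the literal \2, the e flag of GNU sed's s command is one possible route: after a successful substitution it runs the pattern space as a shell command and replaces the pattern space with that command's output. This is a minimal sketch, assuming GNU sed (the flag is not part of POSIX sed); run_with_match is a hypothetical name. Because matched text becomes part of a shell command line, this is only reasonable for trusted input.

```shell
run_with_match() {
    # After the substitution the pattern space reads: echo xbybz
    # The e flag then executes it and keeps the command's output.
    sed -r 's/(.) (.) (.)/echo x\2y\2z/e'
}

echo "a b c" | run_with_match
# prints: xbybz
```

Here the backreference is expanded by sed first and the command runs afterwards, which is the reverse of the order seen with $(...) inside a double-quoted sed script.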
