sed backreferences and command interpolation
Question
I am having an interesting issue using only sed to substitute short month strings (e.g. "Oct") with the corresponding number value (e.g. "10"), given a string such as the following:
Oct 14 09:23:35 some other input
which should be replaced, directly via sed, with:
14-10-2013 09:23:35 some other input
None of the following is actually relevant to solving the trivial problem of month string -> number conversion; I'm more interested in understanding some weird behavior I encountered while trying to solve this problem entirely with sed.
Without any attempt at this string substitution (the echo statement is in lieu of the actual input in my script):
...
MMM_DD_HH_mm_ss="([A-Za-z]{3}) ([0-9]{2}) (.+:[0-9]{2})"
echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_ss (.+)/\2-\1-\3 \4/"
The next question is how to transform the backreference \1 into a number. Of course one thinks of using command interpolation with the backreference as an argument:
...
TestFunc()
{
echo "received input $1$1"
}
...
echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_ss (.+)/\2-$(TestFunc \\1)-\3 \4/"
Where TestFunc would be a variation of the date command (as proposed by Jotne below) with the echo'd date-time group as an input. Here TestFunc is just an echo because I'm much more interested in the behavior of what the function believes to be the value of $1.
In this case the sed with TestFunc produces the output:
14-received input OctOct-09:23:35 some other input
This suggests that sed actually is inserting the backreference \1 into the command substitution $(...) for handling by TestFunc (which appears to receive \1 as the local variable $1).
However, all attempts to do anything more with the local $1 fail. For example:
TestFunc()
{
echo "processed: $1$1" > tmp.txt # Echo 1
if [ "$1" == "Oct" ]; then
echo "processed: 10"
else
echo "processed: $1$1" # Echo 2
fi
}
returns:
14-processed: OctOct-09:23:35 some other input
$1 has been substituted into Echo 2, yet tmp.txt contains the value processed: \1\1, as if the backreference is not being inserted into the command substitution. Even weirder, the if condition fails with $1 != "Oct", yet it falls through to an echo statement which indicates $1 = "Oct".
My question is: why is the backreference insertion working in the case of Echo 2 but not Echo 1? I suspect that the backreference insertion isn't working at all (given the failure of the if statement in TestFunc) but rather that something subtle is going on that makes the substitution appear to work correctly in the case of Echo 2; what is that subtlety?
On further reflection I believe I understand what is going on:
- \\1 is passed to the command substitution (the child function) as the literal \1. This is why the equality test within the child function fails.
- However, echo prints that literal string unchanged, so echo "aa$1aa" returns aa\1aa to sed as the result of the command substitution. Other commands such as rev also "see" $1 as \1.
- sed then interpolates \1 in aa\1aa as Oct, or whatever the backreference is, and returns aaOctaa to the user.
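The order of evaluation described in the list above can be made visible by building the sed script in a variable first. This is a minimal sketch: show_arg is a hypothetical stand-in for TestFunc, the regex is simplified, and -E is used as the portable spelling of GNU sed's -r.

```shell
# Hypothetical stand-in for TestFunc; printf avoids echo's varying backslash handling.
show_arg() { printf 'received input %s%s' "$1" "$1"; }

# The shell expands $(...) while building this string, before sed is started.
# show_arg therefore receives the two characters \1, not the matched text.
script="s/(...) (..) (.*)/\2-$(show_arg \\1)-\3/"
printf 'script passed to sed: %s\n' "$script"

# sed receives the finished script and only now resolves \1 to the match.
result=$(printf '%s\n' 'Oct 14 09:23:35' | sed -E "$script")
printf 'sed output: %s\n' "$result"
```

The first line printed shows \1\1 already embedded in the replacement text, which is why a string comparison inside the function can never see "Oct".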
Since command substitution within the sed expression clearly works, it would be really cool if sed replaced \\1 (or \1, whatever) with the backreference before the command substitution $(...) is executed; this would significantly increase sed's power...
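Something close to this wish may be available in GNU sed through the e flag of the s command, which executes the pattern space as a shell command after the substitution has been made. The sketch below assumes GNU sed (other implementations do not have this flag) and uses tr purely as an illustrative command.

```shell
# GNU sed only: with the e flag, the substituted pattern space is run as a
# shell command and replaced by that command's output. By the time the shell
# runs, \2 has already been replaced with the matched text ("b").
upper=$(printf '%s\n' 'a b c' |
  sed -E 's/(.) (.) (.)/echo "x$(echo \2 | tr a-z A-Z)z"/e')
printf '%s\n' "$upper"
```

Because the matched text becomes part of a shell command line, this is only reasonable for trusted input.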
Recommended answer
This might work for you (GNU sed):
sed -r 's/$/\nJan01...Oct10Nov11Dec12/;s/(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/;s/\n.*//' file
Append a lookup table to the end of the line and use the backreference to match against it, making sure the lookup table is removed in all cases.
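As a worked illustration of the lookup-table idea, here is a sketch with the table written out in full and the three substitutions split onto separate lines. The variable names are hypothetical, the year 2013 is hard-coded as in the answer, and GNU sed is assumed because \n is used in the replacement.

```shell
line='Oct 14 09:23:35 some other input'

converted=$(printf '%s\n' "$line" | sed -E '
  # 1. Append the month table after a newline, which cannot occur in the input line.
  s/$/\nJan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12/
  # 2. \1 finds the month name inside the table; \4 captures the two digits after it.
  s/^(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/
  # 3. If step 2 did not match, the table is still there; strip it.
  s/\n.*//
')
printf '%s\n' "$converted"
```

The third substitution is what guarantees that lines without a recognizable month come out unchanged rather than with the table attached.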
Here's an example of passing a backreference to a function:
f(){ echo "x$1y$1z"; }
echo a b c | sed -r 's/(.) (.) (.)/'"$(f \\2)"'/'
returns:
xbybz
HTH