Why does an n instead of b or d or nothing change the behaviour of sed in this script?

Problem description

While developing an answer for the question How to extract content between two patterns in Unix, I came across a behaviour in sed which I can't explain — can you?

Data file: data

Goodbye

select *   
from dep  
where jkdsfj  

select *   
from sal   
where jkdsfj  

select elephants
from abject poverty
join flying tigers
where abelone = shellfish;

select mouse
from toolset
join animals where tail = cord
and buttons = legs

Hello

The objective is to select the text between the words from and where.

Here are 4 variants of a script:

  • script.16

    /from/,/where/ { s/.*from *//; s/ *where.*//; /^ *$/d; p;    }

  • script.17

    # Bust by final n;
    /from/,/where/ { s/.*from *//; s/ *where.*//; /^ *$/d; p; n; }
    

  • script.18

    /from/,/where/ { s/.*from *//; s/ *where.*//; /^ *$/d; p; d; }
    

  • script.19

    /from/,/where/ { s/.*from *//; s/ *where.*//; /^ *$/d; p; b
    }
    

    These all work with both BSD (Mac OS X) sed and GNU sed. The last script could use b; } and it would work with GNU sed but BSD sed rejects it.
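One possible way to keep the b variant on a single command line while still satisfying BSD sed, offered as a sketch only: split the script across two -e options, so that b is followed by a newline rather than a semicolon. The sample text below is a shortened stand-in for the data file (an assumption for illustration, with the trailing spaces removed), and the BSD behaviour is assumed from the newline form of script.19 rather than tested here.

```shell
# Shortened stand-in for the data file (hypothetical, not the original data).
sample='select *
from dep
where jkdsfj

select mouse
from toolset
join animals where tail = cord
and buttons = legs'

# The -e fragments are joined with a newline, so "b" ends its line and "}"
# starts the next one, which is the same layout script.19 uses.
out=$(printf '%s\n' "$sample" |
    sed -n -e '/from/,/where/ { s/.*from *//; s/ *where.*//; /^ *$/d; p; b' -e '}')
printf '%s\n' "$out"
```

On this sample the command prints dep, toolset and join animals, one per line.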

    The trouble is that the output from script.17 is different from the other 3, and I can't fathom why:

    $ sed -n -f script.16 data
    dep  
    sal   
    abject poverty
    join flying tigers
    toolset
    join animals
    $ sed -n -f script.17 data
    dep  
    select *   
    abject poverty
    toolset
    and buttons = legs
    Hello
    $
    

    Why are the lines "select *", "and buttons = legs" and "Hello" in the output?

    $ sed -n -f script.18 data
    dep  
    sal   
    abject poverty
    join flying tigers
    toolset
    join animals
    $ sed -n -f script.19 data
    dep  
    sal   
    abject poverty
    join flying tigers
    toolset
    join animals
    $ 
    

    Why does using n change the behaviour of sed like this? From some variations I've tried with diagnostic 'printing', it appears that n prevents sed from recognizing that it has seen the where. Yet b and d both jump to the next cycle, much as n normally does, so something else must be different.

    Given that two independent implementations do the same thing, I have to assume it is intentional, but … why?

    Recommended answer

    Summary

    The issue is with the range and what is in the pattern space when the range is evaluated.

    Range endpoints in sed are matched against the contents of the pattern space at the time the range is evaluated, not with respect to the original input lines. Thus, for sed -n '/start/,/end/{...}', what matters is what is in the pattern space at the beginning of the commands, not what is in the pattern space later after commands have been processed or n has caused more lines to be read.

    The problem with p;n in combination with a range can be illustrated with code that is much simpler. Note that, unlike b and d, the command n reads in a line. Consequently, sed -n 'p;n' prints every other line. For example:

    $ seq 5 | sed -n 'p;n'
    1
    3
    5
    

    Now, observe p;n in combination with a range:

    $ seq 5 | sed -n '/1/,/3/{p;n;}'
    1
    3
    

    The above works as expected. The following, however, surprises:

    $ seq 5 | sed -n '/1/,/2/{p;n;}'
    1
    3
    5
    

    The line containing 2 is read in by the n command and is then promptly discarded. The line containing 2 does not appear in the pattern space when the range /1/,/2/ is evaluated. Thus, sed never sees the end of /1/,/2/ and it keeps on going thinking it is within the range.
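For contrast, the same range can be run with d or b in place of n. This is a minimal sketch reusing the seq 5 input from above. The b variant is split across two -e options on the assumption that this keeps BSD sed happy, as in script.19. The variable names are only for illustration.

```shell
# n: line 2 is read in the middle of a cycle and never starts one,
# so it is never tested against the range end /2/.
with_n=$(seq 5 | sed -n '/1/,/2/{p;n;}')

# d: ends the cycle without reading a line, so line 2 starts its own
# cycle, matches /2/ and closes the range.
with_d=$(seq 5 | sed -n '/1/,/2/{p;d;}')

# b: branches to the end of the script, which has the same effect here.
with_b=$(seq 5 | sed -n -e '/1/,/2/{p;b' -e '}')

printf '%s\n' "--- n" "$with_n" "--- d" "$with_d" "--- b" "$with_b"
```

With n the output is 1, 3, 5. With d or b it is 1, 2, because every input line then begins its own cycle and is seen by the range test.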

    Now, let's consider your script 17, slightly modified:

    $ sed -n '/from/,/where/ { s/.*from */BEGIN/; s/ *where.*/END/; /^ *$/d; p; n; }' data
    BEGINdep  
    select *   
    END
    BEGINabject poverty
    END
    BEGINtoolset
    and buttons = legs
    Hello
    

    Here, we see that the range /from/,/where/ runs from an appearance of from until the next time that where is in the pattern space at the start of the commands, which is when the range is evaluated. An instance of where that is read by n never ends a range.
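The same point can be checked with a diagnostic, sketched here under a few assumptions: the = command (which prints the current input line number) is added at the top of the block of script 17, and the input is a shortened stand-in for the data file without trailing spaces. Each number in the output marks a line that started a cycle inside the range.

```shell
# Shortened stand-in for the first part of the data file (hypothetical).
sample='select *
from dep
where jkdsfj

select *
from sal
where jkdsfj'

# "=" prints the input line number each time the block runs, i.e. each
# time a cycle begins while the range is active.
out=$(printf '%s\n' "$sample" |
    sed -n '/from/,/where/ { =; s/.*from *//; s/ *where.*//; /^ *$/d; p; n; }')
printf '%s\n' "$out"
```

The output is 2, dep, 4, 5, select *, 7. Lines 3 and 6 never appear as numbers because n consumed them, so the where on line 3 was never tested. The range only ends at line 7, where the where line happens to begin a cycle.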

    Consider the range /1/,/END/ where END never appears in the file:

    $ seq 5 | sed -n 's/3/END/; /1/,/END/{p;n}'
    1
    END
    

    Even though END never appears in the file, it appears in the pattern space at the time that the range is evaluated. Thus, it ends the range.

    As one more demonstration, let's change the order of the above commands. Below, we see that END does not end the range though it gets printed out:

    $ seq 5 | sed -n ' /1/,/END/{s/3/END/; p; n}'
    1
    END
    5
    

    This is because END is not in the pattern space when the range is evaluated. Thus, sed never sees the end to the range.
