Parsing line and changing some text in place


Question

How do I parse a log file (not a full XML file, but one that contains some XML data) for ExtData tags that hold a name-value pair? I need to mask the value like this. For example:

<ExtData>Name="Jason" Value="Special"</ExtData>
to
<ExtData>Name="Jason" Value="XXXXXXX"</ExtData>

I need to mask the ExtData value as above only when Name is Jason or one of a given set of names, not for every Name.

For example, if "DummyName" is not in the set of names, then I do not want to change the line below.

<ExtData>Name="DummyName" Value="Garbage"</ExtData>

For example, if "DummyName" is not in the set of names, then I do not want to change the line below either. (Please note that the value is "Jason".)

<ExtData>Name="DummyName" Value="Jason"</ExtData>

For example, if "DummyJasonName" is not in the set of names, then I do not want to change the line below. (Note the "Jason" between "Dummy" and "Name".)

<ExtData>Name="DummyJasonName" Value="Garbage"</ExtData>

I need to do all this in a bash/shell script.

The bottom line is that I want to read a file, say via a sed/awk/match command, and check each line for an ExtData tag. If one is found, read the text between the ExtData tag and the /ExtData tag. In this (possibly multiline) text, extract Name. If Name is in the set of names, mask its corresponding "Value" data with an equal number of 'X' characters.

Please let me know how to achieve the above task.

Update: the input can actually span multiple lines.

<ExtData>Name="Jason" 
Value="Special"
    </ExtData>

Or also like this:

<ExtData>
     Name="Jason" 
  Value="Special"
    </ExtData>

Thanks!! Puneet

Recommended answer

To make the substitutions only for the names Jason and Jim, try:

sed -E '/Jason|Jim/{:a; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' file.xml

This command was tested on GNU sed. For BSD/OSX sed, some minor changes would be needed.
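On the BSD/OSX point: one commonly cited difference is that BSD sed reads a label name up to the end of the line, so a label followed by a semicolon and more commands on the same line is not parsed the way GNU sed parses it. A minimal sketch of a more portable spelling, assuming that is the only incompatibility (it has not been verified on BSD sed here), puts each command on its own line. The function name mask_names is made up for illustration and is not part of the original answer:

```shell
# Hypothetical wrapper: same commands as the one-liner, one per line,
# so that every label name ends at a newline instead of at a semicolon.
mask_names() {
  sed -E '/Jason|Jim/{
:a
/Value=/bb
n
ba
:b
s/(Value="X*)[^X"]/\1X/
tb
}' "$@"
}

# Reads standard input when no file name is given.
mask_names <<'EOF'
<ExtData>Name="Jason" Value="Special"</ExtData>
<ExtData>Name="DummyName" Value="Garbage"</ExtData>
<ExtData>Name="Jim"
    Value="OK"
        </ExtData>
EOF
```

Called as mask_names file.xml it should behave like the one-liner on GNU sed; whether it is sufficient on a given BSD sed is something to check on that system.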

Let's consider this test file:

$ cat file.xml
<ExtData>Name="Jason" Value="Special"</ExtData>
<ExtData>Name="DummyName" Value="Garbage"</ExtData>
<ExtData>Name="Jim"
    Value="OK"
        </ExtData>

Now, let's run our command:

$ sed -E '/Jason|Jim/{:a; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' file.xml
<ExtData>Name="Jason" Value="XXXXXXX"</ExtData>
<ExtData>Name="DummyName" Value="Garbage"</ExtData>
<ExtData>Name="Jim"
    Value="XX"
        </ExtData>

How it works

  • -E

    This tells sed to use extended regular expressions.

    /Jason|Jim/{...}

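    This tells sed to run the commands inside the curly braces only for lines that contain Jason or Jim. The match is against the whole line, so it also fires when Jason or Jim appears inside a longer name or inside a Value; the ExtData-specific command further down narrows this to the Name field. The commands inside the braces break down into two parts: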

    1. :a; /Value=/bb; n; ba;

    The first part reads lines until we find one that contains Value=. In more detail, :a defines a label a. /Value=/bb branches to label b if the current line contains Value=. If it doesn't, we print out the current line and read in the next one using the n command. We then branch (b) back to label a.

    2. :b; s/(Value="X*)[^X"]/\1X/; tb;

    This replaces the value with as many X characters as we need.

    In more detail, :b defines a label b. s/(Value="X*)[^X"]/\1X/ substitutes in the next X that we need after Value=. If a substitution was made (meaning that another X was needed), then the test command (t) tells sed to jump back to label b and we try again.
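The masking loop can be tried in isolation. The sketch below wraps just the second part in a helper (mask_value is a hypothetical name, not part of the original answer) and feeds it a single line. Each pass of the loop turns one more character of the value into an X, until only X characters remain between the quotes:

```shell
# Hypothetical helper: runs only the masking loop from the answer (GNU sed syntax).
mask_value() {
  sed -E ':b; s/(Value="X*)[^X"]/\1X/; tb'
}

# Pass 1: Value="OK" becomes Value="XK"; pass 2: Value="XX";
# pass 3 finds nothing left to replace, so t does not jump and the line is printed.
printf '%s\n' 'Value="OK"' | mask_value
```

Because the bracket expression excludes X itself, an X that is already part of the value is skipped rather than replaced, which gives the same result since it is already an X.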

    Let's consider this more complex test file:

    $ cat file2.xml
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="Jason" Value="Special"</ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="DummyName" Value="DontChange"</ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="Jim"
        Value="OK"
            </ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    

    To make the changes in ExtData tags but not in the other tags, try:

    $ sed -E '/[<]ExtData[>]/{:a; /Name=/{/Name="(Jason|Jim)"/!b}; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' file2.xml
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="Jason" Value="XXXXXXX"</ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="DummyName" Value="DontChange"</ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    <ExtData>Name="Jim"
        Value="XX"
            </ExtData>
    <Misc>Name="Jason" Value="DontChange"</Misc>
    

    To do the above using a shell variable for the names:

    names='Jason|Jim'
    # The parentheses keep the alternation inside the quotes, giving Name="(Jason|Jim)".
    sed -E '/[<]ExtData[>]/{:a; /Name=/{/Name="('"$names"')"/!b}; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' file2.xml
    

    This substitutes the shell variable directly into the sed command. Do this only if you trust the source of the shell variable, because its contents become part of the sed script.
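If the list of names comes from somewhere less trusted, one possible precaution is to check it before it reaches sed. The following is a sketch under stated assumptions, not part of the original answer: the function name mask_extdata is hypothetical, names are assumed to consist only of letters, digits, and underscores, and the list is assumed to be separated by | as in the example above. It uses the GNU sed one-liner syntax from the answer.

```shell
# Hypothetical helper: mask_extdata NAMES [FILE...]
# NAMES is a |-separated list such as 'Jason|Jim'.
# Note: this sets the global shell variable "names".
mask_extdata() {
  names=$1
  shift
  # Reject an empty list, any character other than letters, digits,
  # underscore and |, and any leading, trailing or doubled | separator.
  case $names in
    '' | *[!A-Za-z0-9_\|]* | \|* | *\| | *\|\|*)
      echo "mask_extdata: refusing name list: $names" >&2
      return 1
      ;;
  esac
  # Parentheses keep the alternation inside the quotes: Name="(Jason|Jim)".
  sed -E '/[<]ExtData[>]/{:a; /Name=/{/Name="('"$names"')"/!b}; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' "$@"
}

# The edge cases from the question, read from standard input.
mask_extdata 'Jason|Jim' <<'EOF'
<ExtData>Name="Jason" Value="Special"</ExtData>
<ExtData>Name="DummyName" Value="Jason"</ExtData>
<ExtData>Name="DummyJasonName" Value="Garbage"</ExtData>
EOF
```

With this input only the first line is masked; the DummyName and DummyJasonName lines pass through unchanged because the Name field has to match one of the listed names exactly.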
