How to compare and substitute strings in different lines in Unix

Problem description

I want to compare and substitute strings that appear on different lines in Unix.

For example, I have a file with two words on each line:

<a> <b>
<d> <e>
<b> <c>
<c> <e>

If the second word of any line matches the first word of another line, the second word of that line should be replaced with the second word of the matching line. This should repeat until the second word of the line no longer matches the first word of any other line.

I need a result like this:

<a> <e>
<b> <e>
<c> <e>
<d> <e>

I am new to Unix and have no idea how to implement this. Can anyone offer suggestions or explain how to do it?

Recommended answer

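This is very clearly a case for a recursive descent solution: store each line as a first-word to second-word mapping, then follow that mapping recursively until the value reached is no longer the first word of any line: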

$ cat tst.awk
function descend(node) {return (map[node] in map ? descend(map[node]) : map[node])}
{ map[$1] = $2 }
END { for (key in map) print key, descend(key) }

$ awk -f tst.awk file
<a> <e>
<b> <e>
<c> <e>
<d> <e>
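
To try the same idea without creating separate files, the script can be wrapped in a shell function. This is only a sketch, assuming a POSIX `sh` and `awk`; the function name `resolve_chains` is made up for illustration and is not part of the original answer. Because the order in which `for (key in map)` visits keys is unspecified in awk, the sketch pipes the result through `sort` rather than relying on the order shown above.

```shell
#!/bin/sh
# resolve_chains is a hypothetical wrapper name (an assumption, not from the answer).
# It reads "first second" pairs on stdin and prints each first word together
# with the last word reached by following the chain of pairs.
resolve_chains() {
    awk '
        # Follow map[] from node until the value found is not itself a key.
        function descend(node) {
            return (map[node] in map ? descend(map[node]) : map[node])
        }
        { map[$1] = $2 }    # remember every pair from the input
        END { for (key in map) print key, descend(key) }
    ' | LC_ALL=C sort       # for-in order is unspecified, so sort for stable output
}

# Sample data from the question.
printf '%s\n' '<a> <b>' '<d> <e>' '<b> <c>' '<c> <e>' | resolve_chains
```

For the sample data, `<a>` resolves through `<b>` and `<c>` to `<e>`, so every line ends in `<e>`. Note that this simple version recurses forever if the input contains a cycle.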

If infinite recursion is a possibility in your input, here's an approach that prints, as the 2nd field, the last node reached before the recursion would start, with a "*" next to it so you know it's happening:

$ cat tst.awk
function descend(node,  child, descendant) {
    stack[node]
    child = map[node]
    if (child in map) {
        if (child in stack) {
            descendant = node "*"
        }
        else {
            descendant = descend(child)
        }
    }
    else {
        descendant = child
    }
    delete stack[node]
    return descendant
}
{ map[$1] = $2 }
END { for (key in map) print key, descend(key) }

$ cat file
<w> <w>
<x> <y>
<y> <z>
<z> <x>
<a> <b>
<d> <e>
<b> <c>
<c> <e>

$ awk -f tst.awk file
<w> <w>*
<x> <z>*
<y> <x>*
<z> <y>*
<a> <e>
<b> <e>
<c> <e>
<d> <e>

If you need the output order to match the input order and/or to print duplicate lines twice, change the bottom 2 lines of the script to:

{ keys[++numKeys] = $1; map[$1] = $2 }
END {
    for (keyNr=1; keyNr<=numKeys; keyNr++) {
        key = keys[keyNr]
        print key, descend(key)
    }
}
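
Put together, the cycle-aware function and the order-preserving main rules might look like the sketch below. It assumes a POSIX `sh` and `awk`; the wrapper name `resolve_in_order` is hypothetical and used only for illustration.

```shell
#!/bin/sh
# resolve_in_order is a hypothetical wrapper name (an assumption, not from the answer).
# It prints one output line per input line, in input order.
resolve_in_order() {
    awk '
        # child and descendant are extra parameters used as local variables.
        function descend(node,    child, descendant) {
            stack[node]                      # mark node as currently being followed
            child = map[node]
            if (child in map) {
                if (child in stack) {
                    descendant = node "*"    # cycle detected: flag the last node
                }
                else {
                    descendant = descend(child)
                }
            }
            else {
                descendant = child           # end of the chain
            }
            delete stack[node]
            return descendant
        }
        # Record keys in arrival order as well as the mapping itself.
        { keys[++numKeys] = $1; map[$1] = $2 }
        END {
            for (keyNr = 1; keyNr <= numKeys; keyNr++) {
                key = keys[keyNr]
                print key, descend(key)
            }
        }
    '
}

# A small input with one cycle and one ordinary chain.
printf '%s\n' '<x> <y>' '<y> <x>' '<a> <b>' '<b> <c>' | resolve_in_order
```

Since `keys[]` gets one entry per input line, a first word that appears on two lines is printed twice, each time with the mapping from its last occurrence, because `map[$1]` is overwritten.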
