Sed/Awk to search and replace/insert text in files

Problem description

I am trying to update or insert a few comments, such as copyright headers, in all the source files in a directory (Linux). My files are inconsistent: a few already have headers, while others have none at all. I tried using sed to look at the first few lines and replace them. By "replace" I mean updating the files that already have a copyright header to the latest one.

sed -e '1,10 s/Copyright/*Copyright*/g' file

But this does not insert anything if the pattern is not found. How can I achieve this?

The example I provided in the comments, and what I am actually trying to replace/insert, is a typical multi-line copyright header such as the following:

/*
* Copyright 1234 XXXNAME, XYZPlace 
*  text text text text ...........
* blah blah blah */

It may also contain some special characters.

Recommended answer

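If I understand correctly, you want to:

  • find files without a copyright notice in the first 10 lines, and
  • add a copyright notice to those files.

Alternatively, you want to:

  • find files with a copyright notice in the first 10 lines, and
  • update the notice to your standard text.

It seems to me that these two tasks could be boiled down to a single set:

  • remove any existing copyright notice in the first 10 lines, then
  • insert a new copyright notice into the file.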

If we can safely assume that a shortened version of the sample text you put in a comment on your question is valid, and that it should be inserted at, for example, line 2 of each file, then the following should achieve the first set of requirements if you're using GNU sed:

find . -type f -not -exec grep -q Copyright {} \; -exec sed -i'' '2i/* Copyright */' {} \;

If you're not running GNU sed (i.e. you're on FreeBSD, OS X, Solaris, etc.), let us know, because the sed script will be different.
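As a rough sketch of what a non-GNU variant could look like, the following uses only features that POSIX sed specifies: the `i\` command takes its text on the next line, and the result goes through a temporary file because `-i` is not part of POSIX. The scratch directory and the file name `demo.c` are made up for illustration, and `mktemp -d` is assumed to be available.

```shell
#!/bin/sh
# Sketch: insert a header before line 2 using only POSIX sed features.
# The scratch directory and demo.c are hypothetical names for this demo.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/demo.c"

# POSIX form of the insert command: "i\" followed by a newline, then the text.
sed '2i\
/* Copyright */' "$workdir/demo.c" > "$workdir/demo.c.tmp" &&
  mv "$workdir/demo.c.tmp" "$workdir/demo.c"

result=$(cat "$workdir/demo.c")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Writing to a temporary file and renaming it avoids depending on how a particular sed spells its in-place option.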

How this works

The find command gets the following options:

  • -type f tells it to look only at files (not directories or devices).
  • -not inverts the option that follows it.
  • -exec grep -q Copyright {} \; is true for files that contain "Copyright"; because -not precedes it, only files without the word get through.
  • -exec sed -i'' '2i/* Copyright */' {} \; inserts your copyright notice.
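To see the options above working together, here is a small sketch that runs the same pipeline in a throwaway directory. It assumes GNU sed and a find that supports `-not`; the file names `has_header.c` and `no_header.c` are invented for the demo.

```shell
#!/bin/sh
# Sketch: run the find/grep/sed pipeline in a scratch directory (GNU sed assumed).
# has_header.c and no_header.c are hypothetical file names.
workdir=$(mktemp -d)
printf '/* Copyright 2010 */\nint a;\n' > "$workdir/has_header.c"
printf 'int b;\nint c;\n' > "$workdir/no_header.c"

# Only files in which grep finds no "Copyright" reach the sed step.
find "$workdir" -type f -not -exec grep -q Copyright {} \; \
  -exec sed -i '2i/* Copyright */' {} \;

with_header=$(cat "$workdir/has_header.c")
without_header=$(cat "$workdir/no_header.c")
printf '%s\n---\n%s\n' "$with_header" "$without_header"
rm -rf "$workdir"
```

Note that a file with fewer than two lines is left unchanged, because the address `2` never matches in it.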

This solution may run into difficulty if you want your copyright notice to include special characters that would be interpreted by the sed script. But it answers your question. :)
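One possible way around the special-character problem, sketched here as an assumption rather than as part of the original answer, is to keep the header in its own file and let sed's `r` command copy it in, so sed never parses the header text. The names `header.txt` and `demo.c` are hypothetical.

```shell
#!/bin/sh
# Sketch: keep the header in a separate file so sed never interprets its contents.
# header.txt and demo.c are hypothetical names used only for this demo.
workdir=$(mktemp -d)
cat > "$workdir/header.txt" <<'EOF'
/*
 * Copyright 1234 XXXNAME, XYZPlace
 * chars sed would otherwise interpret: \ & / $
 */
EOF
printf 'line one\nline two\n' > "$workdir/demo.c"

# "1r FILE" queues FILE's contents for output after line 1,
# so the header starts at line 2 of the result.
sed "1r $workdir/header.txt" "$workdir/demo.c" > "$workdir/demo.c.tmp" &&
  mv "$workdir/demo.c.tmp" "$workdir/demo.c"

result=$(cat "$workdir/demo.c")
printf '%s\n' "$result"
rm -rf "$workdir"
```

The file name given to `r` runs to the end of the script line, so this sketch assumes the path contains no newline.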

If instead we want to handle the revised requirements, i.e. remove existing copyright notices first, then we can do this with two one-liners:

First, we remove existing copyright notices.

find . -type f -exec sh -c 'head "$1" | grep -q Copyright' sh {} \; -exec sed -i'' -ne '11,$ba;/Copyright/d;:a;p' {} \;

This may be a little redundant, unless you want to traverse subdirectories recursively, which find does by default. The sed script does nothing to files that have no Copyright info in the first 10 lines, so the following should also work, if all your files are in one directory:

for file in *; do sed -i'' -ne '11,$ba;/Copyright/d;:a;p' "$file"; done
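The intended behaviour of the deletion script can be checked on a stream without touching any file. This sketch assumes GNU sed (for the `;` after a label) and that `seq` is available; the input is invented. It uses the unconditional branch `b`, because `t` jumps only after a successful `s` substitution, and the range starts at line 11 so that lines 1 to 10 are the ones examined.

```shell
#!/bin/sh
# Sketch: show which lines the deletion script touches (GNU sed assumed).
# Twelve numbered lines, with "Copyright" planted on lines 3 and 12.
input=$(seq 1 12 | sed '3s/$/ Copyright/;12s/$/ Copyright/')

# Lines 11 and later jump straight to label "a" and are printed untouched;
# in lines 1-10, any line containing "Copyright" is deleted before the print.
result=$(printf '%s\n' "$input" | sed -ne '11,$ba;/Copyright/d;:a;p')
printf '%s\n' "$result"
```

Line 3 is dropped because it falls within the first 10 lines, while line 12 survives even though it also contains the word.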

Next, we add new ones back in.

for file in *;do sed -i'' '2i/* Copyright */' "$file"; done

Or, if you want to do this recursively through subdirectories:

find . -type f -exec sed -i'' '2i/* Copyright */' {} \;

Final update

I can't spend more time on this one after this.

find . -type f \
  -exec sh -c 'head "$1" | grep -q Copyright' sh {} \; \
  -exec sed -i'' -ne '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}' {} \;

What this does

The first -exec searches for the word "Copyright" in the first 10 lines of the file, just like the earlier example above. If grep finds anything, this condition returns true.

The second -exec does the substitution. It reads the entire file into sed's hold buffer. Then, when it gets to the end of the file, it (g) copies the hold buffer back into the pattern space, and (s) does a multi-line substitution.
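The hold-buffer idiom is easier to follow on a tiny input. The sketch below is only an illustration with made-up data: it collects three lines and then replaces the embedded newlines, which is possible only because the whole input sits in the pattern space at once.

```shell
#!/bin/sh
# Sketch: collect every input line in the hold space, then edit them as one string.
# 1h   copies line 1 into the hold space, overwriting whatever was there.
# 1!H  appends each later line to the hold space, preceded by a newline.
# $    on the last line, g copies the hold space back so s can see the newlines.
joined=$(printf 'a\nb\nc\n' | sed -n '1h;1!H;${g;s/\n/,/g;p;}')
printf '%s\n' "$joined"   # prints a,b,c
```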

Note that this may very well require some tuning, and it may not work at all if you have comments elsewhere in the file. I don't recall whether GNU sed supports non-greedy stars. You can research that yourself.

Here's my test:

$ printf 'one\n/* Copyright blah blah\n *\n */\ntwo\n' | sed -n '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}'
one
/* Copyright 1998-2012 */
two

This doesn't preserve your existing Copyright information, but at least it addresses the multi-line issue.
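The caveat about comments elsewhere in the file can be reproduced with a second comment in the input. This is a sketch on invented sample text, assuming GNU sed as in the test above. POSIX basic and extended regular expressions, which sed uses, have no non-greedy quantifier, so `.*` always extends to the last `*/` it can reach.

```shell
#!/bin/sh
# Sketch: the greedy ".*" runs to the LAST "*/" in the file, not the first.
# The sample text is invented for illustration.
result=$(printf 'one\n/* Copyright old */\ntwo\n/* other */\nthree\n' |
  sed -n '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}')
printf '%s\n' "$result"
```

Here the line `two` and the second comment are swallowed along with the old header, so it is worth running the substitution on copies of the files before trusting it on a real source tree.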
