How can I validate large numbers of files with search and replace?

Views: 126. Posted: 2018/6/21 12:34:09. Tags: html, perl, unix, omittag


Problem description


I am currently validating a client's HTML source and I am getting a lot of validation errors for image and input tags which do not have the Omittag. I would fix them manually, but this client literally has thousands of files, with a lot of instances where the tag does not have it.

This client has validated some img tags (for whatever reason).

Just wondering if there is a unix command I could run to check whether a tag lacks the Omittag and, if so, add it.

I have done simple search-and-replace jobs with the following command:

  find . \! -path '*.svn*' -type f -exec sed -i -n '1h;1!H;${;g;s/<b>/<strong>/g;p}' {} \;

But never anything this large. Any help would be appreciated.

Solution
See the questions I asked in my comment at the top.

Assuming you're using GNU sed, and that you're trying to add the trailing / to your tags to make XML-compliant <img /> and <input />, then replace the sed expression in your command with this one, and it should do the trick:

  '1h;1!H;${;g;s/\(img\|input\)\([^>]*[^/]\)>/\1\2\/>/g;p;}'
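
Put together with the find command from the question, the result might look like the sketch below. It assumes GNU sed, and the function name fix_tags is made up for illustration; wrapping the command in a function that takes a directory is only a precaution so that nothing runs against the current directory by accident.

```shell
#!/bin/sh
# Sketch only: the question's find command with the answer's sed expression
# swapped in. Assumes GNU sed (-i and \| are GNU features).
# fix_tags is a hypothetical helper name, not part of the original answer.

# The sed expression from the answer, unchanged.
SED_EXPR='1h;1!H;${;g;s/\(img\|input\)\([^>]*[^/]\)>/\1\2\/>/g;p;}'

fix_tags() {
    # $1: directory to process. Paths containing .svn are skipped,
    # exactly as in the question's command. Files are edited in place.
    find "$1" \! -path '*.svn*' -type f -exec sed -i -n "$SED_EXPR" {} \;
}
```

Calling fix_tags . would process the current tree. Like the original command, this touches every regular file it finds, so it is probably worth running on a copy first, or narrowing the find with something like -name '*.html'.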

Here it is on a simple test file (SO's colorizer doing wacky things):

  $ cat test.html
  This is an <img tag> without closing slash.
  Here is an <img tag /> with closing slash.
  This is an <input tag > without closing slash.
  And here one <input attrib="1"
  > that spans multiple lines.
  Finally one <input attrib="1" /> with closing slash.

  $ sed -n '1h;1!H;${;g;s/\(img\|input\)\([^>]*[^/]\)>/\1\2\/>/g;p;}' test.html
  This is an <img tag/> without closing slash.
  Here is an <img tag /> with closing slash.
  This is an <input tag /> without closing slash.
  And here one <input attrib="1"
  /> that spans multiple lines.
  Finally one <input attrib="1" /> with closing slash.

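
Since the question also asks for a way to check files, one possible dry run is to list the files the expression would change without writing anything. This is a sketch under the same GNU sed assumption; needs_fix and list_unfixed are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only: list files the answer's expression would change,
# without editing them. Assumes GNU sed.
# needs_fix and list_unfixed are hypothetical helper names.

SED_EXPR='1h;1!H;${;g;s/\(img\|input\)\([^>]*[^/]\)>/\1\2\/>/g;p;}'

needs_fix() {
    # Succeeds (exit status 0) when sed's output differs from the file on disk.
    ! sed -n "$SED_EXPR" "$1" | cmp -s - "$1"
}

list_unfixed() {
    # $1: directory to scan; .svn paths are skipped as in the question's command.
    find "$1" \! -path '*.svn*' -type f | while IFS= read -r file; do
        if needs_fix "$file"; then
            printf '%s\n' "$file"
        fi
    done
}
```

Running list_unfixed . prints one path per line for every file that still has an img or input tag the expression would rewrite.
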
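This relies on GNU sed regex syntax and on buffering the whole file to do a multiline search/replace: 1h copies the first line into the hold space, 1!H appends every later line to it, and on the last line ($) g copies the accumulated text back into the pattern space, where s///g can match across line breaks. Because of -n, the final p is what prints the result.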
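
The buffering idiom can be tried in isolation. The sketch below wraps it in a small function, slurp_replace (a hypothetical name), that applies any s command to the whole of standard input; it assumes GNU sed, where \n in a regex matches a newline.

```shell
#!/bin/sh
# Sketch only: the whole-file buffering idiom on its own. Assumes GNU sed.
# slurp_replace is a hypothetical helper name.

slurp_replace() {
    # $1: a sed s command, applied once to the entire input read from stdin.
    #   1h    first line: copy it into the hold space
    #   1!H   every other line: append it to the hold space
    #   ${    on the last line only:
    #     g     copy the accumulated hold space into the pattern space
    #     $1    run the substitution over the whole text
    #     p     print it (nothing else is printed because of -n)
    sed -n "1h;1!H;\${;g;$1;p;}"
}
```

For example, printf 'one <b\n>two\n' | slurp_replace 's/<b\n>/<strong>/' rewrites a tag that is split across two lines, whereas a plain sed 's/<b\n>/<strong>/' works line by line and leaves the same input unchanged.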

Alternatively, you could use something like Tidy, which is designed for sanitizing bad HTML. That's what I'd do if I were doing anything more complicated than a couple of simple search/replaces. Tidy's options get complicated fast, so it's usually better to write a script in your scripting language of choice (Python, Perl) that calls libtidy and sets whatever options you need.
