Suggestions on how to build an HTML Diff tool?


Question

In this post I asked whether any tools compare the structure (not the actual content) of two HTML pages. I ask because I receive HTML templates from our designers, and frequently miss minor formatting changes in my implementation. I then waste a few hours of designer time sifting through my pages to find my mistakes.

The thread offered some good suggestions, but there was nothing that fit the bill. "Fine, then", thought I, "I'll just crank one out myself. I'm a halfway-decent developer, right?".

Well, once I started to think about it, I couldn't quite figure out how to go about it. I can crank out a data-driven website easily enough, or do a CMS implementation, or throw documents in and out of BizTalk all day. Can't begin to figure out how to compare HTML docs.

Well, sure, I have to read the DOM, and iterate through the nodes. I have to map the structure to some data structure (how??), and then compare them (how??). It's a development task like none I've ever attempted.

So now that I've identified a weakness in my knowledge, I'm even more challenged to figure this out. Any suggestions on how to get started?

Clarification: the actual content isn't what I want to compare. The creative guys fill their pages with lorem ipsum, and I use real content. Instead, I want to compare structure:

<div class="foo">lorem ipsum</div>

is different than

<div class="foo">
<p>lorem ipsum</p>
</div>

Solution

Run both files through the following Perl script, then compare the two outputs with diff -iw to get a case-insensitive, whitespace-ignoring diff.

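#! /usr/bin/perl -w
# Reduce an HTML document on STDIN to one tag per line,
# replacing every run of text with the placeholder "(text)".

use strict;

undef $/;    # slurp mode: read all of STDIN in one go

my $html = <STDIN>;

while ($html =~ /\S/) {
  if ($html =~ s/^\s*<//) {
    # A tag: print everything up to the next ">".
    # The /s flag lets "." match newlines, so a tag may span several lines.
    $html =~ s/^(.*?)>//s or die "malformed HTML";
    print "<$1>\n";
  } else {
    # A text run: drop the content and print a placeholder instead.
    $html =~ s/^([^<]+)//;
    print "(text)\n";
  }
}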
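
For anyone who would rather not shell out to Perl and diff, the same idea can be sketched in Python with only the standard library. This is not part of the original answer, just a minimal illustration under a few assumptions: the names StructureExtractor, structure_lines, and structure_diff are made up for the example, attributes are sorted so their order does not count as a difference, and adjacent text runs collapse into a single "(text)" line.

```python
import difflib
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Collect one line per tag and a "(text)" placeholder per text run."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []

    def _emit(self, line):
        # Collapse adjacent text runs so the amount of content never matters.
        if line == "(text)" and self.lines and self.lines[-1] == "(text)":
            return
        self.lines.append(line)

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        # Sort attributes by name so source order does not show up in the diff.
        parts = [tag] + [
            name if value is None else f'{name}="{value}"'
            for name, value in sorted(attrs, key=lambda kv: kv[0])
        ]
        self._emit("<" + " ".join(parts) + ">")

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> the same as <br>.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags.
        if data.strip():
            self._emit("(text)")


def structure_lines(html):
    """Return the structural skeleton of an HTML string as a list of lines."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines


def structure_diff(html_a, html_b):
    """Return a unified diff of the two skeletons (empty list if identical)."""
    return list(difflib.unified_diff(
        structure_lines(html_a), structure_lines(html_b),
        fromfile="template", tofile="implementation", lineterm=""))
```

To use it, read the designer's template and your implementation into two strings and print "\n".join(structure_diff(template_html, implementation_html)). An empty result means the two skeletons match, however much the text content differs.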
