How to monitor a static HTML page for changes with Greasemonkey? Use a hash?


Question

I want my Greasemonkey script to run ONLY when the static page it's accessing has the exact same content as before...

Now I have the ability to set a variable containing a hash of this page. I'm looking for a way to hash the page on the fly, so that I can compare my hash to the generated hash...

Any ideas on how to do this hashing on the fly?

Recommended answer

From your question:

I want my Greasemonkey script to run ONLY when the static page it's accessing has the exact same content as before...

What you really want is for your script to detect page changes. A hash is not normally the best way to do that, see below.

If you want to detect that a static page has changed, there are 3 basic methods. But they all have varying degrees of speed, versus storage requirements, versus trustworthiness.

Additionally, you will need a persistent storage mechanism. For this, localStorage is perfect. Do not use GM_setValue(), especially for the methods with high storage requirements.

The three methods are:

  1. Ask the server. This is the easiest, fastest, and simplest method. Servers tell you when a page was last modified, and your script can see this value with document.lastModified.
    Unfortunately, many servers lie! Some servers/applications (like the ones I make) only change this field when actual payload data on a page has changed. But other sites will always report that the page has just been modified -- for a variety of, not always nice, purposes (Usually to show you fresh ads).
    So, unless you are monitoring well-behaved sites, this method will report false changes.
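
As a rough sketch of method 1, the check below saves the previous document.lastModified string and compares it on each load. The helper name lastModifiedChanged and the storage key are illustrative assumptions, not part of any Greasemonkey API. Keep in mind that when a server sends no Last-Modified header, browsers report the current time, so such pages will always look changed.

```javascript
// Hypothetical helper: compares the given Last-Modified string with the
// one saved under `key`, then saves the new value.
// `storage` is any object with getItem/setItem, e.g. window.localStorage.
function lastModifiedChanged (storage, key, lastModified) {
    var previous = storage.getItem (key);   // null on the first visit
    storage.setItem (key, lastModified);
    return previous !== lastModified;
}

// Only runs in a browser; the key name is an assumption.
if (typeof document !== "undefined"  &&  typeof localStorage !== "undefined") {
    var key = "Last Modified: " + location.pathname;
    if (lastModifiedChanged (localStorage, key, document.lastModified) ) {
        console.log ("The server reports a change.");
    }
    else {
        console.log ("The server reports NO change.");
    }
}
```

Like the scripts in this answer, the first visit counts as a change because nothing has been stored yet.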

  2. Directly compare the static HTML. This method won't be fooled by lying servers and is much faster than trying to compute a CRC, etc. It's also simple to implement, but the drawback is that it can use a lot of storage space, which could eventually become a problem.

For example, this page's HTML currently (before I post this answer) clocks in at 45,000 characters. A longish WSJ article clocked in at 160K and some blogs might be a megabyte or two.

So, are you checking a reasonable number of reasonably-sized sites or a whole lot of bloated sites?

The following complete script shows this technique:

// ==UserScript==
// @name     YOUR_SCRIPT_NAME
// @include  http://YOUR_SERVER.COM/YOUR_PATH/*
// ==/UserScript==

var lastPageSource  = localStorage.getItem ('Last Page HTML')  ||  "";

//-- Does the current page match the last stored page?
if (lastPageSource !== document.body.innerHTML) {
    console.log ("The page has changed.");

    //-- Update the stored value.
    localStorage.setItem ('Last Page HTML', document.body.innerHTML);
}
else {
    console.log ("The page has NOT changed.");

    // DO WHATEVER HERE.
}
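
One caveat with the script above: it uses a single storage key, so with a wildcard @include every matching page on the site overwrites the same entry. The sketch below is one possible way to keep a snapshot per page and to survive a full storage quota. The helper names pageKey and hasPageChanged are hypothetical, and keying on the path plus query string is an assumption about what counts as "the same page".

```javascript
// Hypothetical helpers; `storage` is any object with getItem/setItem,
// e.g. window.localStorage.
function pageKey (prefix, pageId) {
    return prefix + ": " + pageId;
}

function hasPageChanged (storage, key, currentHtml) {
    //-- Exact string comparison against the stored snapshot (null if none).
    var changed = storage.getItem (key) !== currentHtml;

    if (changed) {
        try {
            storage.setItem (key, currentHtml);
        }
        catch (err) {
            //-- localStorage throws when the quota is exceeded.
            console.warn ("Could not store the page snapshot: ", err);
        }
    }
    return changed;
}

// Only runs in a browser.
if (typeof document !== "undefined"  &&  typeof localStorage !== "undefined") {
    var key = pageKey ("Last Page HTML", location.pathname + location.search);

    if (hasPageChanged (localStorage, key, document.body.innerHTML) ) {
        console.log ("The page has changed.");
    }
    else {
        console.log ("The page has NOT changed.");
    }
}
```

If the snapshot cannot be stored, the page will simply be reported as changed again on the next load.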


  3. Compute a hash/checksum/CRC of the static HTML, and compare that. This method doesn't bloat the browser's persistent storage, but it could potentially slow down page response a fair bit, as computing these values takes a fair bit of calculation.

Since you just want to tell whether a page has changed, use the faster CRC-32 rather than a hash.

Check the CRC of the loaded page to detect changes between loads. The following uses a commonly used function from webtoolkit, slightly modified.

Here is a complete script that uses CRC-32 and localStorage to detect static page changes:

// ==UserScript==
// @name     YOUR_SCRIPT_NAME
// @include  http://YOUR_SERVER.COM/YOUR_PATH/*
// ==/UserScript==

var lastPageCRC = localStorage.getItem ('Last Page CRC')  ||  "";
var currentCRC  = crc32 (document.body.innerHTML).toString ();

console.log ("Last crc: ",     lastPageCRC);
console.log ("Current crc: ",  currentCRC);

//-- Does the current page match the last stored page?
if (lastPageCRC.localeCompare (currentCRC) ) {
    console.log ("The page has changed.");

    //-- Update the stored value.
    localStorage.setItem ('Last Page CRC', currentCRC);
}
else {
    console.log ("The page has NOT changed.");

    // DO WHATEVER HERE.
}

/**
*   Javascript crc32
*   http://www.webtoolkit.info/
*   With slight adjustments (crc init, code order)
**/
function crc32 (str) {
    str         = Utf8Encode (str);
    var table   = "00000000 77073096 EE0E612C 990951BA 076DC419 706AF48F E963A535 9E6495A3 0EDB8832 79DCB8A4 E0D5E91E 97D2D988 09B64C2B 7EB17CBD E7B82D07 90BF1D91 1DB71064 6AB020F2 F3B97148 84BE41DE 1ADAD47D 6DDDE4EB F4D4B551 83D385C7 136C9856 646BA8C0 FD62F97A 8A65C9EC 14015C4F 63066CD9 FA0F3D63 8D080DF5 3B6E20C8 4C69105E D56041E4 A2677172 3C03E4D1 4B04D447 D20D85FD A50AB56B 35B5A8FA 42B2986C DBBBC9D6 ACBCF940 32D86CE3 45DF5C75 DCD60DCF ABD13D59 26D930AC 51DE003A C8D75180 BFD06116 21B4F4B5 56B3C423 CFBA9599 B8BDA50F 2802B89E 5F058808 C60CD9B2 B10BE924 2F6F7C87 58684C11 C1611DAB B6662D3D 76DC4190 01DB7106 98D220BC EFD5102A 71B18589 06B6B51F 9FBFE4A5 E8B8D433 7807C9A2 0F00F934 9609A88E E10E9818 7F6A0DBB 086D3D2D 91646C97 E6635C01 6B6B51F4 1C6C6162 856530D8 F262004E 6C0695ED 1B01A57B 8208F4C1 F50FC457 65B0D9C6 12B7E950 8BBEB8EA FCB9887C 62DD1DDF 15DA2D49 8CD37CF3 FBD44C65 4DB26158 3AB551CE A3BC0074 D4BB30E2 4ADFA541 3DD895D7 A4D1C46D D3D6F4FB 4369E96A 346ED9FC AD678846 DA60B8D0 44042D73 33031DE5 AA0A4C5F DD0D7CC9 5005713C 270241AA BE0B1010 C90C2086 5768B525 206F85B3 B966D409 CE61E49F 5EDEF90E 29D9C998 B0D09822 C7D7A8B4 59B33D17 2EB40D81 B7BD5C3B C0BA6CAD EDB88320 9ABFB3B6 03B6E20C 74B1D29A EAD54739 9DD277AF 04DB2615 73DC1683 E3630B12 94643B84 0D6D6A3E 7A6A5AA8 E40ECF0B 9309FF9D 0A00AE27 7D079EB1 F00F9344 8708A3D2 1E01F268 6906C2FE F762575D 806567CB 196C3671 6E6B06E7 FED41B76 89D32BE0 10DA7A5A 67DD4ACC F9B9DF6F 8EBEEFF9 17B7BE43 60B08ED5 D6D6A3E8 A1D1937E 38D8C2C4 4FDFF252 D1BB67F1 A6BC5767 3FB506DD 48B2364B D80D2BDA AF0A1B4C 36034AF6 41047A60 DF60EFC3 A867DF55 316E8EEF 4669BE79 CB61B38C BC66831A 256FD2A0 5268E236 CC0C7795 BB0B4703 220216B9 5505262F C5BA3BBE B2BD0B28 2BB45A92 5CB36A04 C2D7FFA7 B5D0CF31 2CD99E8B 5BDEAE1D 9B64C2B0 EC63F226 756AA39C 026D930A 9C0906A9 EB0E363F 72076785 05005713 95BF4A82 E2B87A14 7BB12BAE 0CB61B38 92D28E9B E5D5BE0D 7CDCEFB7 0BDBDF21 86D3D2D4 F1D4E242 68DDB3F8 1FDA836E 81BE16CD F6B9265B 6FB077E1 18B74777 88085AE6 FF0F6A70 66063BCA 11010B5C 8F659EFF F862AE69 616BFFD3 166CCF45 A00AE278 D70DD2EE 4E048354 3903B3C2 A7672661 D06016F7 4969474D 3E6E77DB AED16A4A D9D65ADC 40DF0B66 37D83BF0 A9BCAE53 DEBB9EC5 47B2CF7F 30B5FFE9 BDBDF21C CABAC28A 53B39330 24B4A3A6 BAD03605 CDD70693 54DE5729 23D967BF B3667A2E C4614AB8 5D681B02 2A6F2B94 B40BBE37 C30C8EA1 5A05DF1B 2D02EF8D";
    var crc     = 0;
    var x       = 0;
    var y       = 0;

    for (var i = 0, iTop = str.length;  i < iTop;  i++) {
        y   = (crc ^ str.charCodeAt (i) ) & 0xFF;
        x   = "0x" + table.substr (y * 9, 8);
        crc = (crc >>> 8) ^ x;
    }

    return crc ^ (-1);

    function Utf8Encode (string) {
        string = string.replace (/\r\n/g,"\n");
        var utftext = "";
        for (var n = 0; n < string.length; n++) {
            var c = string.charCodeAt (n);
            if (c < 128) {
                utftext += String.fromCharCode (c);
            }
            else if ((c > 127) && (c < 2048)) {
                utftext += String.fromCharCode ((c >> 6) | 192);
                utftext += String.fromCharCode ((c & 63) | 128);
            }
            else {
                utftext += String.fromCharCode ((c >> 12) | 224);
                utftext += String.fromCharCode (( (c >> 6) & 63) | 128);
                utftext += String.fromCharCode ((c & 63) | 128);
            }
        }
        return utftext;
    };
};
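
For comparison, here is a more compact sketch that builds the same 256-entry lookup table at startup instead of embedding it as a string. It is not the webtoolkit function: it implements the standard CRC-32 (initial value 0xFFFFFFFF, reflected polynomial 0xEDB88320), whereas the function above starts from crc = 0, so the two should not be expected to produce interchangeable values. The names makeCrcTable and crc32Standard are illustrative, and the use of TextEncoder for UTF-8 assumes a reasonably modern browser.

```javascript
// Build the standard CRC-32 lookup table (reflected polynomial 0xEDB88320).
function makeCrcTable () {
    var table = new Uint32Array (256);
    for (var n = 0; n < 256; n++) {
        var c = n;
        for (var k = 0; k < 8; k++) {
            c = (c & 1)  ?  (0xEDB88320 ^ (c >>> 1))  :  (c >>> 1);
        }
        table[n] = c >>> 0;
    }
    return table;
}

var CRC_TABLE = makeCrcTable ();

// Standard CRC-32 of the UTF-8 bytes of `str`, as an unsigned 32-bit number.
function crc32Standard (str) {
    var bytes = new TextEncoder ().encode (str);
    var crc   = 0xFFFFFFFF;

    for (var i = 0, iTop = bytes.length;  i < iTop;  i++) {
        crc = (crc >>> 8) ^ CRC_TABLE[(crc ^ bytes[i]) & 0xFF];
    }
    return (crc ^ 0xFFFFFFFF) >>> 0;
}

// The standard CRC-32 check value for "123456789" is 0xCBF43926.
console.log (crc32Standard ("123456789").toString (16) );   // cbf43926
```

If you switch an existing script to this variant, the first load after the switch will report a change, because the stored value was produced by the other function.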
