Chrome extension to parse Google results doesn't work


Problem description

I've been experimenting with Chrome's extension mechanism and trying to write an extension that manipulates Google search results (adding comments, screenshots, favicons, etc.).

So far I've managed to write code that uses a RegEx to add images to a link, and it works OK.

The problem is that it doesn't work on Google results. I read here that this happens because the page hasn't fully loaded, so I added a 'DOMContentLoaded' listener, but it didn't help.

Here's my code (content script):

function parse_google() {
    document.body.innerHTML = document.body.innerHTML.replace(
        new RegExp("<a href=\"(.*)\"(.*)</a>", "g"),
        "<img src=\"http://<path-to-img.gif>\" /><a href=\"$1\"$2</a>"
    );
    alert("boooya!");
}
alert("content script: before");
document.addEventListener('DOMContentLoaded', parse_google(), false);
alert("content script: end");

I get all the alerts, but it doesn't work on Google. Why?

Solution

"DOMContentLoaded" refers to the static HTML of the page, but Google's search results are fetched using AJAX, thus are not there yet when the "DOMContentLoaded" event is triggered.

You could use a MutationObserver instead, to observe "childList" DOM mutations on a root node and its descendants.
(If you choose this approach, the mutation-summary library might come in handy.)
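
A side note on the question's code, separate from the AJAX issue: writing `parse_google()` with parentheses runs the function immediately and hands its return value (`undefined`) to `addEventListener`, which would explain why every alert appears even though no listener is actually registered. The sketch below illustrates the difference. It uses a bare `EventTarget` as a stand-in for `document` and a renamed `parseGoogle` function so that it can run outside a browser; both are assumptions made for illustration only.

```javascript
// Records what happens, in order.
var log = [];

function parseGoogle() {
    log.push("parseGoogle ran");
}

// Stand-in for `document` (assumption: any EventTarget behaves the same here).
var target = new EventTarget();

// Wrong: the parentheses call parseGoogle right now. Its return value
// (undefined) is what the original code passes as the listener.
var wrongListener = parseGoogle();
log.push("listener passed by the original code: " + typeof wrongListener);

// Right: pass the function itself, so it runs when the event fires.
target.addEventListener("DOMContentLoaded", parseGoogle, false);
log.push("before event");
target.dispatchEvent(new Event("DOMContentLoaded"));

console.log(log);
```

Even with the reference passed correctly, a content script may be injected after the event has already fired, and Google's results arrive later anyway, so the observer approach is still needed.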

After a (really shallow) search, I found out that (at least for me) Google places its results in a div with id search. Below is the code of a sample extension that does the following:

  1. Registers a MutationObserver to detect the insertion of div#search into the DOM.

  2. Registers a MutationObserver to detect "childList" changes in div#search and its descendants.

  3. Whenever an <a> node is added, a function traverses the relevant nodes and modifies the links. (The script ignores <script> elements for obvious reasons.)

This sample extension just encloses the link's text in ~~, but you can easily change it to do whatever you need.
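
As one possible customization, the question mentions adding favicons next to results. The sketch below shows how the "~~" step could be swapped for an image inserted before each link. `faviconUrlFor` and `decorateLink` are hypothetical names, and the "/favicon.ico" path is only the conventional location, which not every site uses.

```javascript
// Hypothetical helper: guess a site's favicon URL from a link's href.
// Assumes the conventional "/favicon.ico" location.
function faviconUrlFor(href) {
    var url;
    try {
        url = new URL(href);
    } catch (e) {
        return null; // not an absolute URL
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") {
        return null; // skip javascript:, mailto:, etc.
    }
    return url.origin + "/favicon.ico";
}

// Hypothetical replacement for the "~~" line in modifyLinks():
// inserts a small <img> right before the given '<a>' element.
function decorateLink(anchor) {
    var src = faviconUrlFor(anchor.href);
    if (!src) {
        return;
    }
    var img = anchor.ownerDocument.createElement("img");
    img.src = src;
    img.width = 16;
    img.height = 16;
    anchor.parentNode.insertBefore(img, anchor);
}
```

Inside `modifyLinks`, the call would be `decorateLink(node)` in place of the line that rewrites `node.innerHTML`.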

manifest.json:

{
    "manifest_version": 2,
    "name":    "Test Extension",
    "version": "0.0",

    "content_scripts": [{
        "matches": [
            ...
            "*://www.google.gr/*",
            "*://www.google.com/*"
        ],
        "js":         ["content.js"],
        "run_at":     "document_end",
        "all_frames": false
    }]
}

content.js:

console.log("Injected...");

/* MutationObserver configuration data: Listen for "childList"
 * mutations in the specified element and its descendants */
var config = {
    childList: true,
    subtree: true
};
var regex = /<a.*?>[^<]*<\/a>/;

/* Traverse 'rootNode' and its descendants and modify '<a>' tags */
function modifyLinks(rootNode) {
    var nodes = [rootNode];
    while (nodes.length > 0) {
        var node = nodes.shift();
        if (node.tagName == "A") {
            /* Modify the '<a>' element */
            node.innerHTML = "~~" + node.innerHTML + "~~";
        } else {
            /* If the current node has children, queue them for further
             * processing, ignoring any '<script>' tags. */
            [].slice.call(node.children).forEach(function(childNode) {
                if (childNode.tagName != "SCRIPT") {
                    nodes.push(childNode);
                }
            });
        }
    }
}

/* Observer1: Looks for 'div#search' */
var observer1 = new MutationObserver(function(mutations) {
    /* For each MutationRecord in 'mutations'... */
    mutations.some(function(mutation) {
        /* ...if nodes have been added... */
        if (mutation.addedNodes && (mutation.addedNodes.length > 0)) {
            /* ...look for 'div#search' */
            var node = mutation.target.querySelector("div#search");
            if (node) {
                /* 'div#search' found; stop observer 1 and start observer 2 */
                observer1.disconnect();
                observer2.observe(node, config);

                if (regex.test(node.innerHTML)) {
                    /* Modify any '<a>' elements already in the current node */
                    modifyLinks(node);
                }
                return true;
            }
        }
    });
});

/* Observer2: Listens for the insertion of '<a>' elements */
var observer2 = new MutationObserver(function(mutations) {
    mutations.forEach(function(mutation) {
        if (mutation.addedNodes) {
            [].slice.call(mutation.addedNodes).forEach(function(node) {
                /* If 'node' or any of its descendants are '<a>'... */
                if (regex.test(node.outerHTML)) {
                    /* ...do something with them */
                    modifyLinks(node);
                }
            });
        }
    });
});

/* Start observing 'body' for 'div#search' */
observer1.observe(document.body, config);
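
One thing to watch for: if the page moves or re-inserts a node whose links were already modified, the traversal above would wrap them in "~~" a second time. A minimal sketch of one way to guard against that, assuming a hypothetical `modifyLinksOnce` variant that remembers processed elements in a `WeakSet`:

```javascript
// Hypothetical guard: remembers which '<a>' elements were already modified.
var processed = new WeakSet();

/* Same traversal as modifyLinks(), but each '<a>' is modified at most once.
 * Returns the number of links modified by this call. */
function modifyLinksOnce(rootNode) {
    var nodes = [rootNode];
    var modifiedCount = 0;
    while (nodes.length > 0) {
        var node = nodes.shift();
        if (node.tagName == "A") {
            if (!processed.has(node)) {
                processed.add(node);
                node.innerHTML = "~~" + node.innerHTML + "~~";
                modifiedCount++;
            }
        } else if (node.children) {
            /* Queue children for processing, ignoring '<script>' tags. */
            [].slice.call(node.children).forEach(function(childNode) {
                if (childNode.tagName != "SCRIPT") {
                    nodes.push(childNode);
                }
            });
        }
    }
    return modifiedCount;
}
```

Both observers could call `modifyLinksOnce(node)` where they currently call `modifyLinks(node)`; the `WeakSet` does not keep removed elements alive.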
