How to create a DOM object from an HTML page received over XMLHttpRequest?


Question

I am developing a Chrome extension, so XMLHttpRequest has cross-origin access to the domains the extension requests permissions for.

I used XMLHttpRequest to fetch an HTML page (text/html). I want to use XPath (document.evaluate) to extract the relevant bits from it. Unfortunately, I am failing to construct a DOM object from the returned HTML string.

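var xhr = new XMLHttpRequest();
// encodeURIComponent replaces the deprecated escape() for query-string values.
var name = encodeURIComponent("Sticks N Stones Cap");
xhr.open("GET", "http://items.jellyneo.net/?go=show_items&name=" + name + "&name_type=exact", true);
xhr.onreadystatechange = function () {
    // readyState 4 means the request has completed.
    if (xhr.readyState == 4) {
        var parser = new DOMParser();
        // Parse the response body as XML and log the resulting document.
        var xmlDoc = parser.parseFromString(xhr.responseText, "text/xml");
        console.log(xmlDoc);
    }
};

xhr.send();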

console.log is used to display debug output in the Chromium JS console.

In that JS console, I get this:

Document
<html>
<body>
<parsererror style="display: block; white-space: pre; border: 2px solid #c77; padding: 0 1em 0 1em; margin: 1em; background-color: #fdd; color: black">
<h3>This page contains the following errors:</h3>
<div style="font-family:monospace;font-size:12px">error on line 1 at column 60: Space required after the Public Identifier
</div>
<h3>Below is a rendering of the page up to the first error.</h3>
</parsererror>
</body>
</html>

So how am I supposed to use XMLHttpRequest -> receive HTML -> convert to DOM -> use XPath to traverse it?

Should I be using the "hidden iframe" hack for loading / receiving the DOM object?

Recommended answer

The DOMParser is choking on the DOCTYPE definition: with "text/xml" the response is parsed as strict XML, and an HTML DOCTYPE is not well-formed XML. It would also error on any other non-XHTML markup, such as a <link> without a closing /. Do you have control over the document being sent? If not, your best bet is to parse it as a string and use regular expressions to find what you are looking for.
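The string-parsing suggestion can be sketched as a small helper. This is only an illustration: the function name extractBodyHtml and the sample markup are made up here, and a regular expression is a rough tool that can break on unusual markup (for example, a </body> inside a script or a comment).

```javascript
// Hypothetical helper (not from the original answer): returns the markup
// between <body ...> and </body>, or null when no body element is found.
function extractBodyHtml(html) {
    // [\s\S]* matches across newlines; the greedy match runs to the last </body>.
    var match = /<body[^>]*>([\s\S]*)<\/body>/i.exec(html);
    return match ? match[1] : null;
}

// Made-up sample input. The DOCTYPE and the unclosed <link> do not matter
// here, because the text is never handed to an XML parser.
var sample = '<!DOCTYPE html><html><head><link rel="stylesheet" href="a.css"></head>' +
             '<body class="x"><p>Hello</p></body></html>';

console.log(extractBodyHtml(sample));          // <p>Hello</p>
console.log(extractBodyHtml("no body here"));  // null
```

Returning null instead of throwing lets the caller decide what to do when the response is not the page it expected, such as an error page without a body.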

Edit: You can get the browser to parse the contents of the body for you by injecting it into a hidden div:

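var hidden = document.body.appendChild(document.createElement("div"));
hidden.style.display = "none";
// Capture everything between <body ...> and </body>; exec() returns null when nothing matches.
var bodyMatch = /<body[^>]*>([\s\S]+)<\/body>/i.exec(xhr.responseText);
hidden.innerHTML = bodyMatch ? bodyMatch[1] : "";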

Now search inside hidden to find what you're looking for:

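// The HTML parser inserts <tbody>, so "table.foo > tr" would not match; use a descendant selector.
var myEl = hidden.querySelector("table.foo tr > td.bar > span.fu");
// querySelector returns null when nothing matches.
var myVal = myEl ? myEl.innerHTML : null;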
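Since the question asks for XPath specifically, here is a rough sketch of an alternative to the hidden div. It is not part of the original answer and assumes a browser whose DOMParser accepts "text/html". The function names htmlToDocument and xpathStrings and the sample markup are illustrative only.

```javascript
// Sketch only: parse an HTML string into a Document and run an XPath query on it.
// Assumes a browser-like environment that provides DOMParser and XPathResult
// (for example a Chrome extension page); the function names are illustrative.
function htmlToDocument(html, Parser) {
    // "text/html" selects the lenient HTML parser, unlike "text/xml".
    return new Parser().parseFromString(html, "text/html");
}

function xpathStrings(doc, expression) {
    // Evaluate against the parsed document itself, not the current page.
    var result = doc.evaluate(expression, doc, null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
    var values = [];
    for (var i = 0; i < result.snapshotLength; i++) {
        values.push(result.snapshotItem(i).textContent);
    }
    return values;
}

// Demo with made-up markup; it runs only where DOMParser exists (browsers).
if (typeof DOMParser !== "undefined") {
    var parsed = htmlToDocument(
        "<!DOCTYPE html><table><tr><td class='bar'>42</td></tr></table>",
        DOMParser);
    console.log(xpathStrings(parsed, "//td[@class='bar']"));
}
```

In the XMLHttpRequest callback from the question, the call would look like htmlToDocument(xhr.responseText, DOMParser). Unlike the hidden div, the parsed document is never attached to the page, so nothing from the fetched markup is added to the visible DOM.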
