DOM parsing in JavaScript

Question

Some background:

I'm developing a web-based mobile application in JavaScript. HTML rendering is Safari-based. The cross-domain policy is disabled, so I can make calls to other domains using XMLHttpRequest. The idea is to parse external HTML and get the text content of a specific element.

In the past I parsed the text line by line to find the line I needed, then extracted the tag's content as a substring of that line. This is very troublesome and requires a lot of maintenance each time the target HTML changes.

So now I want to parse the HTML text into a DOM and run CSS or XPath queries on it.

This works well:

$('<div></div>').append(htmlBody).find('#theElementToFind').text()

The only problem is that when I use the browser to load the HTML text into a DOM element, it tries to load all external resources (images, JS files, etc.). Although this isn't causing any serious problem, I would like to avoid it.

Now the question:

How can I parse HTML text into a DOM without the browser loading external resources or running JS scripts?

Some ideas I've been thinking about:

  • Creating a new document object with document.implementation.createDocument(), but I'm not sure it will skip loading external resources.
  • Using a third-party DOM parser written in JS. The only one I've tried handled errors very badly.
  • Using an iframe to create a new document, so that external resources with relative paths don't throw errors in the console.

Recommended answer

The following code seems to work well:

var doc = document.implementation.createHTMLDocument("");
doc.documentElement.innerHTML = htmlBody;
var text = $(doc).find('#theElementToFind').text();

External resources aren't loaded, and scripts aren't evaluated.

Found it here:
https://stackoverflow.com/a/9251106/95624

Source:
https://developer.mozilla.org/en/DOMParser#DOMParser_HTML_extension_for_other_browsers
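
The MDN page linked above concerns DOMParser, so here is a minimal sketch of the same idea using DOMParser with the "text/html" type instead of createHTMLDocument. The function name extractText and its parameters are hypothetical and not part of the original answer. A document produced by parseFromString has scripting disabled, so scripts in the markup are not run; whether images and other resources stay unfetched is worth verifying on the target Safari build, and older WebKit builds did not accept "text/html" at all, which is the gap the linked MDN section addresses.

```javascript
// Sketch only: `extractText`, `html`, `selector` and `ParserCtor` are
// illustrative names, not taken from the original post.
function extractText(html, selector, ParserCtor = globalThis.DOMParser) {
  // Fail loudly where no DOMParser exists (for example, outside a browser).
  if (typeof ParserCtor !== 'function') {
    throw new Error('DOMParser is not available in this environment');
  }
  // Parse the string into a separate Document that is not attached to the page.
  const doc = new ParserCtor().parseFromString(html, 'text/html');
  // Run a CSS query on the parsed document; no jQuery needed for this step.
  const el = doc.querySelector(selector);
  // Return null when nothing matches so the caller can tell "missing" from "empty".
  return el ? el.textContent : null;
}
```

In a browser this would be called as extractText(htmlBody, '#theElementToFind'). The third parameter exists only so the parser can be swapped out, for instance with a stub when testing outside a browser.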
