Get web page text via JavaScript


Question

Is there a JavaScript statement that will retrieve the contents/text from a web page?

Recommended answer

You could do it with Ranges / TextRanges. This has the advantage of only getting the visible text on the page (unlike, for example, the textContent property of elements in non-IE browsers, which will also get you the contents of <script> and possibly other elements). The following will work in all mainstream browsers although I can't make any guarantees about the consistency of line breaks between different browsers.

Update, November 2012

I don't think this is a good idea these days. While Selection is now specified, its toString() method is not, and for some time (including when Microsoft were implementing it for IE 9) it was specified to behave like textContent. For this particular method, browser consistency has got worse rather than better since 2009.

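function getBodyText(win) {
    var doc = win.document, body = doc.body, selection, range, bodyText;
    if (body.createTextRange) {
        // TextRange branch (e.g. older Internet Explorer): read the rendered text directly.
        return body.createTextRange().text;
    } else if (win.getSelection) {
        // Selection branch: select the whole body and read the selection's text.
        selection = win.getSelection();
        range = doc.createRange();
        range.selectNodeContents(body);
        // Clear any existing selection first; some browsers ignore addRange() otherwise.
        selection.removeAllRanges();
        selection.addRange(range);
        bodyText = selection.toString();
        selection.removeAllRanges();
        return bodyText;
    }
    // Neither API is available.
    return "";
}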

alert( getBodyText(window) );
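
Given the caveat above about Selection.toString() behaving inconsistently, one alternative (not part of the original answer) is to walk the DOM and concatenate text nodes while skipping elements whose contents are never rendered. The sketch below is a minimal illustration under stated assumptions: the names collectText, getBodyTextByWalking and SKIPPED_TAGS are hypothetical, the skip list is a guess at what counts as non-visible, and the code ignores CSS (so text hidden with display: none is still included) and does not insert line breaks between block elements.

```javascript
// Hypothetical skip list: elements whose text content is not rendered as page text.
var SKIPPED_TAGS = ["SCRIPT", "STYLE", "NOSCRIPT", "TEMPLATE"];

// Recursively gather the values of text nodes under `node` into `parts`.
function collectText(node, parts) {
    parts = parts || [];
    if (node.nodeType === 3) {
        // 3 === Node.TEXT_NODE
        parts.push(node.nodeValue);
    } else if (node.nodeType === 1 &&
               SKIPPED_TAGS.indexOf(node.nodeName.toUpperCase()) === -1) {
        // 1 === Node.ELEMENT_NODE; descend into children unless the tag is skipped.
        for (var i = 0; i < node.childNodes.length; i++) {
            collectText(node.childNodes[i], parts);
        }
    }
    // Comments and other node types contribute nothing.
    return parts;
}

// Convenience wrapper: text of everything under doc.body.
function getBodyTextByWalking(doc) {
    return collectText(doc.body).join("");
}
```

In a browser this would be called as getBodyTextByWalking(document). Because it only reads nodeType, nodeName, nodeValue and childNodes, it leaves the user's current selection untouched, unlike the Selection-based function above.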
