Get web page text via JavaScript
Question
Is there a JavaScript statement that will retrieve the contents/text from a web page?
Recommended answer
You could do it with Ranges / TextRanges. This has the advantage of only getting the visible text on the page (unlike, for example, the textContent property of elements in non-IE browsers, which will also get you the contents of <script> and possibly other elements). The following will work in all mainstream browsers, although I can't make any guarantees about the consistency of line breaks between different browsers.
Update, November 2012
I don't think this is a good idea these days. While Selection is now specified, its toString() method is not, and for some time (including when Microsoft were implementing it for IE 9) it was specified to behave like textContent. For this particular method, browser consistency has got worse rather than better since 2009.
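function getBodyText(win) {
    var doc = win.document, body = doc.body, selection, range, bodyText;
    if (body.createTextRange) {
        // Old IE: a TextRange over the body exposes its rendered text directly
        return body.createTextRange().text;
    } else if (win.getSelection) {
        // Other browsers: select the body's contents and read the selection as a string
        selection = win.getSelection();
        range = doc.createRange();
        range.selectNodeContents(body);
        // Clear any existing selection first so the new range is the only one selected
        selection.removeAllRanges();
        selection.addRange(range);
        bodyText = selection.toString();
        selection.removeAllRanges();
        return bodyText;
    }
}

alert( getBodyText(window) );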
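Given the caveat about Selection's toString() and the remark that textContent also returns the contents of <script>, one possible alternative is to walk the DOM and collect text nodes while skipping such elements. The following is only a minimal sketch, not part of the original answer: collectText and SKIPPED_ELEMENTS are hypothetical names, the list of skipped elements is an assumption, and the function does not check CSS, so text hidden with display: none is still included.

```javascript
// Hypothetical helper (not from the original answer).
// Assumption: these elements hold content that is not page text.
var SKIPPED_ELEMENTS = { SCRIPT: true, STYLE: true, NOSCRIPT: true, TEMPLATE: true };

function collectText(node) {
    // nodeType 3 is a text node: return its text unchanged
    if (node.nodeType === 3) {
        return node.nodeValue;
    }
    // Only descend into elements (1), documents (9) and document fragments (11);
    // comments and other node types contribute nothing
    if (node.nodeType !== 1 && node.nodeType !== 9 && node.nodeType !== 11) {
        return "";
    }
    // Skip elements such as <script> entirely, including their descendants
    if (node.nodeType === 1 && SKIPPED_ELEMENTS[node.nodeName.toUpperCase()]) {
        return "";
    }
    var parts = [];
    for (var i = 0; i < node.childNodes.length; i++) {
        parts.push(collectText(node.childNodes[i]));
    }
    return parts.join("");
}

// In a browser: alert( collectText(document.body) );
```

Because it relies only on nodeType, nodeName, nodeValue and childNodes, this sketch avoids Selection altogether, at the cost of not reflecting what is actually rendered or where line breaks would appear.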