Replacing a lot of text in a browser add-on

Views: 266 | Published: 2017/11/20 21:00:13 | Tags: javascript, performance, firefox-addon

Problem description

I'm trying to develop a Firefox add-on that transliterates the text on any page into a specific language. Actually, it's just a set of 2D arrays that I iterate over, using this code:
```javascript
function escapeRegExp(str) {
    return str.replace(/([.*+?^=!:${}()|\[\]\/\\])/g, "\\$1");
}

function replaceAll(find, replace) {
    return document.body.innerHTML.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}

function convert2latin() {
    for (var i = 0; i < Table.length; i++) {
        document.body.innerHTML = replaceAll(Table[i][1], Table[i][0]);
    }
}
```
It works, and I don't need to worry about touching HTML tags, since those are in English only, but the problem is performance, which is very poor. As I have no experience with JS, I searched on Google and found that maybe documentFragment could help.

Maybe I should use a different approach altogether?

Solution

Based on your comments, you appear to have already been told that the most expensive thing is the DOM rebuild that happens when you completely replace the entire contents of the page (i.e. when you assign to document.body.innerHTML). You are currently doing that for each substitution, which makes Firefox re-render the entire page for every substitution. You only need to assign to document.body.innerHTML once, after you have made all of the substitutions.

The following should provide a first pass at making it faster:
```javascript
function escapeRegExp(str) {
    return str.replace(/([.*+?^=!:${}()|\[\]\/\\])/g, "\\$1");
}

function convert2latin() {
    // Declared with let so it does not leak as an implicit global.
    let newInnerHTML = document.body.innerHTML;
    for (let i = 0; i < Table.length; i++) {
        newInnerHTML = newInnerHTML.replace(new RegExp(escapeRegExp(Table[i][1]), 'g'), Table[i][0]);
    }
    // Assign to the DOM only once, after all substitutions are done.
    document.body.innerHTML = newInnerHTML;
}
```
You mention in comments that there is no real need to use a RegExp for the match, so the following would be even faster:
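```javascript
function convert2latin() {
    let newInnerHTML = document.body.innerHTML;
    for (let i = 0; i < Table.length; i++) {
        // A string pattern passed to replace() only replaces the first match,
        // so split/join is used to replace every occurrence without a RegExp.
        newInnerHTML = newInnerHTML.split(Table[i][1]).join(Table[i][0]);
    }
    document.body.innerHTML = newInnerHTML;
}
```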
If you really need to use a RegExp for the match, and you are going to perform these exact substitutions multiple times, you are better off creating all of the RegExp objects prior to the first use (e.g. when Table is created/changed) and storing them (e.g. in Table[i][2]).
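
A minimal sketch of that precompilation idea, assuming each `Table` row is a `[replacement, search]` pair of strings as in the question. The helper names `compileTable` and `transliterate` and the sample rows are hypothetical, not part of the original code:

```javascript
// Hypothetical sample table: [replacement, text to find], as in the question.
const Table = [
    ["sh", "ш"],
    ["zh", "ж"],
    ["ya", "я"],
];

function escapeRegExp(str) {
    return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build each RegExp once and store it in the third column (Table[i][2]).
function compileTable(table) {
    for (const row of table) {
        row[2] = new RegExp(escapeRegExp(row[1]), "g");
    }
    return table;
}

// Reuse the stored RegExp objects for every piece of text.
function transliterate(text, table) {
    for (const row of table) {
        // A function replacement inserts row[0] literally, even if it contains "$".
        text = text.replace(row[2], () => row[0]);
    }
    return text;
}

compileTable(Table);
console.log(transliterate("жя ш", Table)); // prints "zhya sh"
```

`compileTable` would need to be called again whenever the table changes, since the stored RegExp objects are only as current as the rows they were built from.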

However, assigning to document.body.innerHTML is a bad way to do this:

As the8472 mentioned, replacing the entire content of document.body.innerHTML is a very heavy-handed way to perform this task. It has some significant disadvantages, including probably breaking the functionality of other JavaScript in the page and potential security issues. A better solution would be to change only the textContent of the text nodes.

One way to do this is to use a TreeWalker. The code could look something like this:
```javascript
function convert2latin(text) {
    for (let i = 0; i < Table.length; i++) {
        // split/join replaces every occurrence; replace() with a string
        // pattern would only replace the first one.
        text = text.split(Table[i][1]).join(Table[i][0]);
    }
    return text;
}

//Create the TreeWalker
let treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, {
    acceptNode: function(node) {
        if (node.textContent.length === 0
            || node.parentNode.nodeName === 'SCRIPT'
            || node.parentNode.nodeName === 'STYLE'
        ) {
            //Don't include 0 length, <script>, or <style> text nodes.
            return NodeFilter.FILTER_SKIP;
        } //else
        return NodeFilter.FILTER_ACCEPT;
    }
});

//Make a list of nodes prior to modifying the DOM. Once the DOM is modified the TreeWalker
//  can become invalid (i.e. stop after the first modification). Doing so is not needed
//  in this case, but is a good habit for when it is needed.
let nodeList = [];
while (treeWalker.nextNode()) {
    nodeList.push(treeWalker.currentNode);
}

//Iterate over all text nodes, changing the textContent of the text nodes
nodeList.forEach(function(el) {
    el.textContent = convert2latin(el.textContent);
});
```
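
If the table is large, another option is to combine all search strings into one alternation and look each match up in a `Map`. This is not part of the answer above; it is swapped in here purely as an illustration. Each text node is then scanned once rather than once per table row. A rough sketch, assuming the same `[replacement, search]` row layout; `buildReplacer`, `toLatin`, and the sample rows are hypothetical:

```javascript
function escapeRegExp(str) {
    return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns a function that applies every table row in a single pass over the text.
function buildReplacer(table) {
    // Map from search string to replacement; empty search strings are ignored.
    const lookup = new Map(
        table.filter(([, from]) => from.length > 0).map(([to, from]) => [from, to])
    );
    if (lookup.size === 0) {
        return (text) => text; // nothing to replace
    }
    // Longest search strings first, so a longer entry wins over its own prefix.
    const pattern = [...lookup.keys()]
        .sort((a, b) => b.length - a.length)
        .map(escapeRegExp)
        .join("|");
    const regex = new RegExp(pattern, "g");
    return (text) => text.replace(regex, (match) => lookup.get(match));
}

const toLatin = buildReplacer([
    ["sh", "ш"],
    ["zh", "ж"],
    ["ya", "я"],
]);
console.log(toLatin("жя ш")); // prints "zhya sh"
```

In the text-node loop, a function like `toLatin` could take the place of `convert2latin`. Note that a single pass does not rescan replaced text, so a replacement that itself contains another row's search string is left as-is. That differs from applying the rows one after another.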
