JavaScript: Whitespace Characters Being Removed in Chrome (but not Firefox)

Problem Description

Why would the code below eliminate the whitespace around matched keyword text when replacing it with an anchor link? Note that this problem occurs only in Chrome, not in Firefox.

For complete context, the file is located at: http://seox.org/lbp/lb-core.js

To view the code in action (no errors found yet), the demo page is at http://seox.org/test.html. Copying and pasting the first paragraph into a rich text editor (e.g. Dreamweaver, or Gmail with the rich text editor turned on) will reveal the problem, with words bunched together. Pasting it into a plain text editor will not.

// Find page text (not in links) -> doxdesk.com
function findPlainTextExceptInLinks(element, substring, callback) {
    for (var childi= element.childNodes.length; childi-->0;) {
        var child= element.childNodes[childi];
        if (child.nodeType===1) {
            if (child.tagName.toLowerCase()!=='a')
                findPlainTextExceptInLinks(child, substring, callback);
        } else if (child.nodeType===3) {
            var index= child.data.length;
            while (true) {
                index= child.data.lastIndexOf(substring, index);
                if (index===-1 || limit.indexOf(substring.toLowerCase()) !== -1)
                    break;
                // don't match an alphanumeric char
                var dontMatch =/\w/;
                if(child.nodeValue.charAt(index - 1).match(dontMatch) || child.nodeValue.charAt(index+keyword.length).match(dontMatch))
                    break;
                // alert(child.nodeValue.charAt(index+keyword.length + 1));
                callback.call(window, child, index)
            }
        }
    }
}

// Linkup function, call with various type cases (below)
function linkup(node, index) {

    node.splitText(index+keyword.length);
    var a= document.createElement('a');
    a.href= linkUrl;
    a.appendChild(node.splitText(index));
    node.parentNode.insertBefore(a, node.nextSibling);
    limit.push(keyword.toLowerCase()); // Add the keyword to memory
    urlMemory.push(linkUrl); // Add the url to memory
}

// lower case (already applied)
findPlainTextExceptInLinks(lbp.vrs.holder, keyword, linkup);

Thanks in advance for your help. I'm nearly ready to launch the script, and will gladly comment in kudos to you for your assistance.

Recommended Answer

It's not anything to do with the linking functionality; it happens to copied links that are already on the page too, and the credit content, even if the processSel() call is commented out.

It seems to be a weird bug in Chrome's rich text copy function. The content in the holder is fine; if you cloneContents the selected range and alert its innerHTML at the end, the whitespaces are clearly there. But whitespaces just before, just after, and at the inner edges of any inline element (not just links!) don't show up in rich text.

Even if you add new text nodes to the DOM containing spaces next to a link, Chrome swallows them. I was able to make it look right by inserting non-breaking spaces:

var links= lbp.vrs.holder.getElementsByTagName('a');
for (var i= links.length; i-->0;) {
    links[i].parentNode.insertBefore(document.createTextNode('\xA0 '), links[i]);
    links[i].parentNode.insertBefore(document.createTextNode(' \xA0'), links[i].nextSibling);
}

but that's pretty ugly, should be unnecessary, and doesn't fix up other inline elements. Bad Chrome!

var keyword = links[i].innerHTML.toLowerCase();

It's unwise to rely on innerHTML to get text from an element, as the browser may escape or not-escape characters in it. Most notably &, but there's no guarantee over what characters the browser's innerHTML property will output.

As you seem to be using jQuery already, grab the content with text() instead.

var isDomain = new RegExp(document.domain, 'g');
if (isDomain.test(linkUrl)) { ...

That'll fail every second time, because global (g) regexps remember their previous state (lastIndex): when used with methods like test, you're supposed to keep calling repeatedly until they return no match.

You don't seem to need g (multiple matches) here... but then you don't seem to need regexp here either as a simple String indexOf would be more reliable. (In a regexp, each . in the domain would match any character in the link.)
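
To make both points concrete, here is a minimal sketch that can be run outside the browser. The domain and linkUrl values are hypothetical stand-ins for document.domain and a link's URL, and containsDomain is an illustrative helper, not part of the original script.

```javascript
// Hypothetical stand-ins for document.domain and a link's URL.
var domain = 'seox.org';
var linkUrl = 'http://seox.org/test.html';

// With the g flag, test() resumes from lastIndex, so the results alternate.
var isDomainGlobal = new RegExp(domain, 'g');
var globalResults = [
    isDomainGlobal.test(linkUrl), // match found; lastIndex now points past it
    isDomainGlobal.test(linkUrl), // search resumes after the match and fails; lastIndex resets to 0
    isDomainGlobal.test(linkUrl)  // starts from 0 again and matches
];

// An unescaped "." matches any character, so a different host slips through.
var lookalikeUrl = 'http://seoxZorg.example/';
var regexMatchesLookalike = new RegExp(domain).test(lookalikeUrl);

// Plain substring search keeps no state and treats "." literally.
function containsDomain(url) {
    return url.indexOf(domain) !== -1;
}

console.log(globalResults, regexMatchesLookalike);
console.log(containsDomain(linkUrl), containsDomain(lookalikeUrl));
```

The substring check gives the same answer no matter how many times it is called, which is the property the regexp version lacks.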

Better still, use the URL decomposition properties on Location to do a direct comparison of hostnames, rather than crude string-matching over the whole URL:

if (location.hostname===links[i].hostname) { ...
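
Outside a page there is no location object or anchor element, so the sketch below uses the standard URL constructor to get the same hostname property. This is an assumption for illustration: isSameHost and the example URLs are hypothetical, and in the page itself the one-line comparison above is all that's needed.

```javascript
// Compare hostnames rather than string-matching over the whole URL.
// isSameHost is a hypothetical helper; in a page you would compare
// location.hostname with the link element's hostname directly.
function isSameHost(pageUrl, linkUrl) {
    return new URL(pageUrl).hostname === new URL(linkUrl).hostname;
}

var pageUrl = 'http://seox.org/test.html';

// Same host, different path.
console.log(isSameHost(pageUrl, 'http://seox.org/lbp/lb-core.js'));

// Different host that merely mentions the domain in its query string;
// a substring or regexp test over the whole URL would wrongly accept it.
console.log(isSameHost(pageUrl, 'http://example.com/?ref=seox.org'));
```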

// don't match an alphanumeric char
var dontMatch =/\w/;
if(child.nodeValue.charAt(index - 1).match(dontMatch) || child.nodeValue.charAt(index+keyword.length).match(dontMatch))
    break;

If you want to match words on word boundaries, and case insensitively, I think you'd be better off using a regex rather than plain substring matching. That'd also save doing four calls to findText for each keyword as it is at the moment. You can grab the inner bit (in if (child.nodeType==3) { ...) of the function in this answer and use that instead of the current string matching.

The annoying thing about making regexps from string is adding a load of backslashes to the punctuation, so you'll want a function for that:

// Backslash-escape string for literal use in a RegExp
//
function RegExp_escape(s) {
    return s.replace(/([/\\^$*+?.()|[\]{}])/g, '\\$1');
}

var keywordre= new RegExp('\\b'+RegExp_escape(keyword)+'\\b', 'gi');
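
As a rough illustration of what that buys you, the sketch below runs the escaped, word-bounded regexp over plain strings. findWholeWord and the sample sentences are hypothetical; in the real script the matching would happen against each text node's data.

```javascript
// Backslash-escape string for literal use in a RegExp
function RegExp_escape(s) {
    return s.replace(/([\/\\^$*+?.()|[\]{}])/g, '\\$1');
}

// Hypothetical helper: return the start index of every whole-word,
// case-insensitive occurrence of keyword in text.
function findWholeWord(text, keyword) {
    if (!keyword) return []; // an empty pattern would match forever
    var re = new RegExp('\\b' + RegExp_escape(keyword) + '\\b', 'gi');
    var indexes = [];
    var match;
    // With the g flag, exec() is called repeatedly until it returns null.
    while ((match = re.exec(text)) !== null) {
        indexes.push(match.index);
    }
    return indexes;
}

// "linking" is skipped because the keyword is followed by a word character;
// the capitalised "Link" is found thanks to the i flag.
console.log(findWholeWord('linking a Link', 'link'));

// The escaped "." only matches a literal dot.
console.log(findWholeWord('Visit SEOX.org, not seoxZorg', 'seox.org'));
```

One regexp with the i flag replaces the separate passes for each letter-case variant, and \b does the job of the manual charAt checks.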

You could even do all the keyword replacements in one go for efficiency:

var keywords= [];
var hrefs= [];
for (var i=0; i<links.length; i++) {
    ...
    var text= $(links[i]).text();
    keywords.push('(\\b'+RegExp_escape(text)+'\\b)');
    hrefs.push(links[i].href);
}
var keywordre= new RegExp(keywords.join('|'), 'gi');

and then for each match in linkup, check which match group has non-zero length and link with the hrefs entry of the same number.
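
A minimal sketch of that group lookup, assuming the keywords and link targets come from a plain array rather than from the page's anchors: the entries list, its example.com URLs, and findKeywordLinks are all hypothetical. Note that match group n corresponds to hrefs[n - 1], because group 0 is the whole match.

```javascript
// Backslash-escape string for literal use in a RegExp
function RegExp_escape(s) {
    return s.replace(/([\/\\^$*+?.()|[\]{}])/g, '\\$1');
}

// Hypothetical keyword/URL pairs standing in for the page's links.
var entries = [
    { text: 'Chrome', href: 'http://example.com/chrome' },
    { text: 'Firefox', href: 'http://example.com/firefox' }
];

var keywords = [];
var hrefs = [];
for (var i = 0; i < entries.length; i++) {
    keywords.push('(\\b' + RegExp_escape(entries[i].text) + '\\b)');
    hrefs.push(entries[i].href);
}
var keywordre = new RegExp(keywords.join('|'), 'gi');

// For each match, find the capture group that took part and pair the
// matched text with the href stored at the same position.
function findKeywordLinks(text) {
    var results = [];
    var match;
    keywordre.lastIndex = 0; // reset state left over from earlier calls
    while ((match = keywordre.exec(text)) !== null) {
        for (var g = 1; g < match.length; g++) {
            if (match[g]) { // groups that did not take part are undefined
                results.push({ index: match.index, text: match[0], href: hrefs[g - 1] });
                break;
            }
        }
    }
    return results;
}

console.log(findKeywordLinks('Works in firefox but not Chrome.'));
```

Resetting lastIndex before each scan keeps the shared global regexp from carrying state between calls.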
