Javascript Regex to replace text NOT in html attributes

Question

I'd like a Javascript Regex to wrap a given list of words in a given start tag (i.e. <span>) and end tag (i.e. </span>), but only if the word is actually "visible text" on the page, and not inside an html attribute (such as a link's title attribute) or inside a <script></script> block.

I've created a JS Fiddle with the basic setup: http://jsfiddle.net/4YCR6/1/

Recommended answer

HTML is too complex to reliably parse with a regular expression.
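
To illustrate the problem, here is a minimal sketch of what a naive string replacement does. The input string is a made-up example (not taken from the fiddle); it contains the word "test" once inside a title attribute and once in visible text.

```javascript
// Hypothetical input: "test" appears in an attribute and in visible text.
var sample = "<a href='#' title='test link'>run the test</a>";

// Naive approach: replace every occurrence, regardless of where it sits.
var naive = sample.replace(/test/g, "<span>test</span>");

// The attribute value now contains markup, which breaks the element.
console.log(naive);
```

The replacement lands inside the title attribute as well as in the visible text, which is exactly the case the question wants to avoid.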

If you're looking to do this client-side, you can create a document fragment and/or disconnected DOM node (neither of which is displayed anywhere) and initialize it with your HTML string, then walk through the resulting DOM tree and process the text nodes. (Or use a library to help you do that, although it's actually quite simple.)

Here's a DOM walking example. It is slightly simpler than your problem because it only updates the text; it doesn't add new elements to the structure (wrapping parts of the text in spans involves updating the structure), but it should get you going. Notes on what you'll need to change follow the code.

var html =
    "<p>This is a test.</p>" +
    "<form><input type='text' value='test value'></form>" +
    "<p class='testing test'>Testing here too</p>";
var frag = document.createDocumentFragment();
var body = document.createElement('body');
var node, next;

// Turn the HTML string into a DOM tree
body.innerHTML = html;

// Walk the dom looking for the given text in text nodes
walk(body);

// Insert the result into the current document via a fragment
node = body.firstChild;
while (node) {
  next = node.nextSibling;
  frag.appendChild(node);
  node = next;
}
document.body.appendChild(frag);

// Our walker function
function walk(node) {
  var child, next;

  switch (node.nodeType) {
    case 1:  // Element
    case 9:  // Document
    case 11: // Document fragment
      child = node.firstChild;
      while (child) {
        next = child.nextSibling;
        walk(child);
        child = next;
      }
      break;
    case 3: // Text node
      handleText(node);
      break;
  }
}

function handleText(textNode) {
  textNode.nodeValue = textNode.nodeValue.replace(/test/gi, "TEST");
}

The changes you'll need to make will be in handleText. Specifically, rather than updating nodeValue, you'll need to:


  • Find the index of the beginning of each word within the nodeValue string.
  • Use Node#splitText to split the text node into up to three text nodes (the part before your matching text, the part that is your matching text, and the part following your matching text).
  • Use document.createElement to create the new span (this is literally just span = document.createElement('span')).
  • Use Node#insertBefore to insert the new span in front of the third text node (the one containing the text following your matched text); it's okay if you didn't need to create a third node because your matched text was at the end of the text node, just pass in null as the refChild.
  • Use Node#appendChild to move the second text node (the one with the matching text) into the span. (No need to remove it from its parent first; appendChild does that for you.)
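
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: wrapMatches, its pattern argument, and its doc argument are hypothetical names introduced here (doc is passed in so the function does not depend on a global; in a browser you would pass document). It also deviates slightly from the list: it inserts the span in front of the matched text node rather than in front of the third node, so no null refChild case is needed and the span lands in the right place even when the text node has following siblings.

```javascript
// Sketch: wrap every match of `pattern` inside one text node in a <span>.
// `wrapMatches`, `pattern` and `doc` are hypothetical names for illustration.
function wrapMatches(textNode, pattern, doc) {
  // Work with a non-global, non-sticky copy so every search starts at
  // index 0 of whatever text remains.
  var re = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  var node = textNode;
  var match, matchNode, rest, span;

  while (node && (match = re.exec(node.nodeValue))) {
    if (match[0].length === 0) break; // guard against empty matches

    // Split off the part before the match (skip if the match is at index 0).
    matchNode = match.index > 0 ? node.splitText(match.index) : node;

    // Split off the part after the match, if there is any.
    rest = matchNode.nodeValue.length > match[0].length
      ? matchNode.splitText(match[0].length)
      : null;

    // Put the span where the matched text is, then move the text into it.
    span = doc.createElement("span");
    matchNode.parentNode.insertBefore(span, matchNode);
    span.appendChild(matchNode); // appendChild removes it from its old parent

    // Keep searching in the remaining text.
    node = rest;
  }
}

// Possible replacement for handleText in the walker above (browser only).
function handleText(textNode) {
  wrapMatches(textNode, /test/i, document);
}
```

Because the walker records next before calling walk(child), the text nodes and spans created here are skipped rather than revisited, so wrapped text is not processed a second time.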
