JavaScript RegExp match text ignoring HTML


Question

Is it possible to match "the dog is really really fat" in "The <strong>dog</strong> is really <em>really</em> fat!" and wrap the match in "<span class="highlight">WHAT WAS MATCHED</span>"?

I don't mean this example specifically. More generally, is it possible to search the text while ignoring the HTML, keep the HTML in the end result, and just add the span around whatever was matched?

Edit

Considering the HTML tag overlapping problem, would it be possible to match a phrase and just add the span around each of the matched words? The problem here is that I don't want the word "dog" matched when it's not in the searched context, in this case "the dog is really really fat".

Recommended answer

Update:

Here is a working fiddle that does what you want. However, you will need to update htmlTagRegEx to match any HTML tag: as written it only performs a simple match (tags without attributes) and will not handle all cases.

http://jsfiddle.net/briguy37/JyL4J/
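
As a rough idea of what a broader htmlTagRegEx could look like, here is a sketch that also accepts tags with attributes and self-closing tags. It is not taken from the fiddle, stripTags is a hypothetical helper added only to show the pattern in use, and it assumes attribute values never contain a literal ">". It is still not a full HTML parser (comments and script contents are not handled).

```javascript
// Assumed, broader variant of htmlTagRegEx: "<", optional "/", a letter,
// then anything up to the next ">". Attribute values containing ">" will break it.
var htmlTagRegEx = /<\/?[a-zA-Z][^>]*>/;

// Hypothetical helper for demonstration: remove every tag the pattern matches.
function stripTags(html) {
    var globalTagRegEx = new RegExp(htmlTagRegEx.source, 'g');
    return html.replace(globalTagRegEx, '');
}

// Tags with attributes are removed; a bare "<" in text is left alone.
console.log(stripTags('The <strong class="x">dog</strong> is <br/>fat')); // The dog is fat
console.log(stripTags('1 < 2 and 3 > 2'));                                // 1 < 2 and 3 > 2
```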

The code is also below. Basically, it takes the HTML tags out one by one, recording where each one was, then replaces the search text in the remaining plain text with the same text wrapped in the highlight span, and finally pushes the tags back in one by one at their adjusted positions. It's ugly, but it's the easiest way I could think of to get it to work...

function highlightInElement(elementId, text){
    var element = document.getElementById(elementId);
    var elementHtml = element.innerHTML;
    var tags = [];
    var tagLocations = [];
    //Simple matcher: only tags without attributes, e.g. <strong> or </em>
    var htmlTagRegEx = /<\/?\w+>/;

    //Strip the tags from the elementHtml and keep track of them.
    //Each recorded location is the tag's offset in the tag-free text.
    var htmlTag;
    while((htmlTag = htmlTagRegEx.exec(elementHtml)) !== null){
        tagLocations.push(htmlTag.index);
        tags.push(htmlTag[0]);
        elementHtml = elementHtml.substring(0, htmlTag.index) +
            elementHtml.substring(htmlTag.index + htmlTag[0].length);
    }

    //Search for the text in the stripped html (literal, case-sensitive)
    var textLocation = elementHtml.indexOf(text);
    if(textLocation === -1){
        //Not found: leave the element untouched
        return;
    }

    //Add the highlight
    var highlightHTMLStart = '<span class="highlight">';
    var highlightHTMLEnd = '</span>';
    var textEndLocation = textLocation + text.length;
    elementHtml = elementHtml.substring(0, textLocation) +
        highlightHTMLStart + text + highlightHTMLEnd +
        elementHtml.substring(textEndLocation);

    //Plug the HTML tags back in, last to first, shifting each location
    //by the length of whatever highlight markup was inserted before it
    for(var i = tagLocations.length - 1; i >= 0; i--){
        var location = tagLocations[i];
        if(location > textEndLocation){
            location += highlightHTMLStart.length + highlightHTMLEnd.length;
        } else if(location > textLocation){
            location += highlightHTMLStart.length;
        }
        elementHtml = elementHtml.substring(0, location) + tags[i] + elementHtml.substring(location);
    }

    //Update the innerHTML of the element
    element.innerHTML = elementHtml;
}
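
For the overlapping-tag concern raised in the edit to the question, one possible variation is to wrap each matched run of text between tags in its own span, rather than using a single span that may straddle a tag boundary. The sketch below is not part of the original fiddle: highlightInHtml is a hypothetical, DOM-free function that works on an HTML string, it uses an assumed broader tag pattern, and it matches case-insensitively on the assumption that lowercasing does not change the string length (true for ASCII text).

```javascript
// Hypothetical helper: highlight `text` in `html`, ignoring tags while searching,
// and wrap each matched text segment separately so the output stays well nested.
function highlightInHtml(html, text) {
    var tagRegEx = /<\/?[a-zA-Z][^>]*>/g; // assumed tag pattern, not a full HTML parser
    var pieces = [];                       // alternating text segments and tags
    var plain = '';                        // the text with all tags removed
    var last = 0;
    var m;

    function addText(value) {
        if (value.length > 0) {
            pieces.push({ value: value, isTag: false, start: plain.length });
            plain += value;
        }
    }

    while ((m = tagRegEx.exec(html)) !== null) {
        addText(html.slice(last, m.index));
        pieces.push({ value: m[0], isTag: true });
        last = m.index + m[0].length;
    }
    addText(html.slice(last));

    // Locate the phrase in the tag-free text (case-insensitive, first occurrence only)
    var from = plain.toLowerCase().indexOf(text.toLowerCase());
    if (text.length === 0 || from === -1) {
        return html;
    }
    var to = from + text.length;

    return pieces.map(function (piece) {
        if (piece.isTag) {
            return piece.value;
        }
        // Clamp the match range to this segment; skip segments outside the match
        var a = Math.max(from - piece.start, 0);
        var b = Math.min(to - piece.start, piece.value.length);
        if (a >= b) {
            return piece.value;
        }
        return piece.value.slice(0, a) +
            '<span class="highlight">' + piece.value.slice(a, b) + '</span>' +
            piece.value.slice(b);
    }).join('');
}

console.log(highlightInHtml(
    'The <strong>dog</strong> is really <em>really</em> fat!',
    'the dog is really really fat'
));
```

Because the phrase is located in the tag-free text as a whole, a lone "dog" elsewhere in the markup is not highlighted unless the full phrase matches at that position.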
