DOM Parser to highlight keywords not working


Question

This question is related to one I asked before, but because that topic is now closed and I need to ask something further, I am starting a new question, hoping that's fine.

In my previous post I simplified the problem too much, which resulted in simple but not fully working solutions. I realized this recently while implementing my code.

The problem with the solutions in the previous post is that the HTML tags get broken by the replacing functions. I have read in many posts on this site that I need to use a DOM parser. I am very unfamiliar with this, and I tried the code suggested by the user "ircmaxell" in this post, but it does not work for me.

Here is the example I put together:

echo '<style type="text/css">
       .ht{
         background-color: yellow;
       }
     </style>'; 


/* taken from user ircmaxell at https://stackoverflow.com/questions/4081372/highlight-keywords-in-a-paragraph

I just modified line $highlight->setAttribute('class', 'highlight') to $highlight->setAttribute('class', 'ht') and commented the first 2 lines   */

function highlight_paragraph($string, $keyword) {
  //$string = '<p>foo<b>bar</b></p>';
  //$keyword = 'foo';
  $dom = new DomDocument();
  $dom->loadHtml($string);
  $xpath = new DomXpath($dom);
  $elements = $xpath->query('//*[contains(.,"'.$keyword.'")]');
  foreach ($elements as $element) {
   foreach ($element->childNodes as $child) {
     if (!$child instanceof DomText) continue;
     $fragment = $dom->createDocumentFragment();
     $text = $child->textContent;
     $stubs = array();
     while (($pos = stripos($text, $keyword)) !== false) {
       $fragment->appendChild(new DomText(substr($text, 0, $pos)));
       $word = substr($text, $pos, strlen($keyword));
       $highlight = $dom->createElement('span');
       $highlight->appendChild(new DomText($word));
       $highlight->setAttribute('class', 'ht');
       $fragment->appendChild($highlight);
       $text = substr($text, $pos + strlen($keyword));
     }
     if (!empty($text)) $fragment->appendChild(new DomText($text));
     $element->replaceChild($fragment, $child);
   }
 }
 $string = $dom->saveXml($dom->getElementsByTagName('body')->item(0)->firstChild);
 return $string;
}


$string = '<p>This book has been written against a background of both reckless optimism and reckless despair.</p>
<p>It holds that Progress and Doom are two sides of the same medal; that both are articles of superstition, not of faith. It was written out of the conviction that it should be possible to discover the hidden mechanics by which all traditional elements of our political and spiritual world were dissolved into a conglomeration where everything seems to have lost specific value, and has become unrecognizable for human comprehension, unusable for human purpose.</p>
<p> Hannah Arendt, The Origins of Totalitarianism (New York: Harcourt Brace Jovanovich, Inc., 1973 ed.), p.vii, Preface to the First Edition.</p>';

$keywords = array('This', 'book', 'has', 'been', 'written', 'background', 'reckless', 'optimism', 'despair.', 'holds', 'Progress', 'Doom ', 'two', 'sides', 'medal;', 'articles', 'superstition,', 'faith.', 'lost', 'Arendt,', 'Totalitarianism');

foreach ($keywords as $kw) {
  $string = highlight_paragraph($string, $kw);
}

echo $string;

echo $string only returns:

This book has been written against a background of both reckless optimism and reckless despair.

And only the first two words, 'This' and 'book', are highlighted.

Normally it should have output the entire initial string with the keywords highlighted.

I have searched a lot on Stack Overflow and Google and did not find easy-to-use code for my purpose, even though many people have asked the same thing before.

I really need some help here. Thanks in advance!

Recommended answer

You are lucky that I was very bored when I saw this question. ;)

The code you received as an answer doesn't seem to have been tested - I don't know how it could possibly have worked correctly. Anyway, I fixed all the problems and present you with a working version, tested on my locally installed Apache server with PHP 5.3:

function highlight_paragraph($string, $keyword) {
  $dom = new DOMDocument();
  $dom->loadHtml($string);

  // Search for all text blocks containing the keyword
  $xpath = new DOMXpath($dom);
  $textNodes = $xpath->query('//*[contains(.,"'.$keyword.'")]/text()');

  foreach ($textNodes as $textNode) {
    $fragment = $dom->createDocumentFragment();
    $text = $textNode->nodeValue;

    while (($pos = stripos($text, $keyword)) !== false) {
      $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
      $word = substr($text, $pos, strlen($keyword));

      $highlight = $dom->createElement('span');
      $highlight->appendChild(new DOMText($word));
      $highlight->setAttribute('class', 'ht');
      $fragment->appendChild($highlight);

      $text = substr($text, $pos + strlen($keyword));
    }

    // Compare against '' so that a remaining text of "0" is not dropped
    if ($text !== '' && $text !== false)
      $fragment->appendChild(new DOMText($text));

    $textNode->parentNode->replaceChild($fragment, $textNode);
  }

  return $dom->saveHTML();
}
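
A note on the return value: this version returns $dom->saveHTML(), which serializes the whole document, including the doctype and the html and body wrappers that loadHTML() adds. The code in the question instead serialized only the first child of body, i.e. the first <p>, which is consistent with only the first paragraph coming back. The following is a minimal sketch of how the function might be called in a loop and how the wrapper could be stripped at the end. The helper body_inner_html() is hypothetical (it is not part of the original answer), and the sketch assumes PHP 5.3.6 or later, where saveHTML() accepts a node argument.

```php
<?php
// Condensed copy of highlight_paragraph() from the answer above.
function highlight_paragraph($string, $keyword) {
  $dom = new DOMDocument();
  $dom->loadHTML($string);
  $xpath = new DOMXPath($dom);
  $textNodes = $xpath->query('//*[contains(.,"'.$keyword.'")]/text()');

  foreach ($textNodes as $textNode) {
    $fragment = $dom->createDocumentFragment();
    $text = $textNode->nodeValue;

    while (($pos = stripos($text, $keyword)) !== false) {
      // Text before the match, then the match wrapped in <span class="ht">
      $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
      $highlight = $dom->createElement('span');
      $highlight->appendChild(new DOMText(substr($text, $pos, strlen($keyword))));
      $highlight->setAttribute('class', 'ht');
      $fragment->appendChild($highlight);
      $text = substr($text, $pos + strlen($keyword));
    }

    if (!empty($text))
      $fragment->appendChild(new DOMText($text));

    $textNode->parentNode->replaceChild($fragment, $textNode);
  }

  return $dom->saveHTML();
}

// Hypothetical helper (an assumption, not from the answer):
// serialize only the children of <body>, dropping doctype/html/body.
function body_inner_html($html) {
  $dom = new DOMDocument();
  $dom->loadHTML($html);
  $body = $dom->getElementsByTagName('body')->item(0);
  $out = '';
  foreach ($body->childNodes as $child) {
    $out .= $dom->saveHTML($child);
  }
  return $out;
}

// Shortened sample text, for illustration only.
$string = '<p>Progress and Doom are two sides of the same medal.</p>'
        . '<p>Both are articles of superstition.</p>';

foreach (array('Progress', 'sides', 'articles') as $kw) {
  $string = highlight_paragraph($string, $kw);
}

echo body_inner_html($string);
```

Two caveats apply to this sketch. XPath's contains() is case-sensitive while stripos() is not, so a text node is only selected when its parent element contains the keyword in the exact case given. Also, the keyword is concatenated into the XPath expression unescaped, so a keyword containing a double quote would break the query.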
