Case-insensitive XPath searching in PHP

Question

I have an xml file like this:

<volume name="Early">
<book name="School Years">
<chapter number="1">
<line number="1">Here's the first line with Chicago in it.</line>
<line number="2">Here's a line that talks about Atlanta</line>
<line number="3">Here's a line that says chicagogo </line>
</chapter>
</book>
</volume>

I'm trying to do a simple keyword search using PHP that finds the word and displays the line it was in. I have this working:

$xml = simplexml_load_file($data);
$keyword = $_GET['keyword'];
$kw=$xml->xpath("//line[contains(text(),'$keyword')]");
...snip...

echo $kw[0]." is the first returned item";

However, using this technique, a user must search for 'Chicago' and not 'chicago', or the search will return nothing.

I understand I need to use the translate function but all my trial and error has been in vain.

I tried:

$upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$lower = "abcdefghijklmnopqrstuvwxyz";
$kw = $xml->xpath("line[contains(text(),'translate('$keyword','$upper','$lower'))]");

but nothing seems to work. Any tips?

Recommended answer

Gordon's recommendation to use a PHP function from within XPath will prove more flexible should you choose to use that. However, contrary to his answer, the translate string function is available in XPath 1.0 so that means you can use it; your problem is how.

First, there is the obvious typo that Charles pointed out in his comment to the question. Then there is the logic of how you're trying to match the text values.

In word form, you are currently asking, "does the text contain the lowercase form of the keyword?" This is not really what you want to be asking. Instead, ask, "does the lowercase text contain the lowercase keyword?" Translating (pardon the pun) that back into XPath-land would be:

(Note: the alphabets are truncated for readability.)

//line[contains(translate(text(),'ABC...Z','abc...z'),'chicago')]

The above lowercases the text contained within the line node then checks that it (the lowercased text) contains the keyword chicago.

And now for the obligatory code snippet (but really, the above idea is what you really need to take home):

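$xml    = simplexml_load_file($data);
$search = strtolower($keyword); // lowercase the keyword once, in PHP

// Lowercase each line's text inside XPath, then look for the lowercased keyword.
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';
$nodes = $xml->xpath("//line[contains(translate(text(), '$upper', '$lower'), '$search')]");

echo 'Got ' . count($nodes) . ' matches!' . PHP_EOL;
foreach ($nodes as $node) {
    echo $node . PHP_EOL;
}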
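For comparison, the alternative mentioned at the start of this answer (calling a PHP function from inside XPath) might look roughly like the sketch below. It is not Gordon's original code. It uses the DOM extension's DOMXPath::registerPhpFunctions(), and the helper name line_contains_ci and the hard-coded keyword are assumptions made for illustration. The keyword is assumed to contain no quote characters, since it is pasted into the XPath string as a literal.

```php
<?php
// Sample data copied from the question.
$source = <<<'XML'
<volume name="Early">
<book name="School Years">
<chapter number="1">
<line number="1">Here's the first line with Chicago in it.</line>
<line number="2">Here's a line that talks about Atlanta</line>
<line number="3">Here's a line that says chicagogo </line>
</chapter>
</book>
</volume>
XML;

// Hypothetical helper: returns a real boolean so XPath can use it as a predicate.
function line_contains_ci(string $haystack, string $needle): bool
{
    return stripos($haystack, $needle) !== false;
}

$doc = new DOMDocument();
$doc->loadXML($source);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('line_contains_ci');

$keyword = 'CHICAGO'; // assumed to contain no quote characters

// php:functionString passes the string value of each <line> to the helper.
$nodes = $xpath->query("//line[php:functionString('line_contains_ci', ., '$keyword')]");

foreach ($nodes as $node) {
    echo $node->getAttribute('number') . ': ' . trim($node->textContent) . PHP_EOL;
}
```

The helper returns a boolean rather than the raw stripos() result because a numeric predicate in XPath is treated as a position test, so a match at offset 0 would otherwise be misread. Swapping stripos() for mb_stripos() would be one way to cover non-ASCII text, which translate() with an A-Z alphabet does not handle.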


Edit after dijon's comment

Inside the foreach, you could access the line number, chapter number and book name like below.

Line number -- this is just an attribute on the <line> element which makes accessing it super-easy. There are two ways, with SimpleXML, of accessing it: $node['number'] or $node->attributes()->number (I prefer the former).

Chapter number -- to get at this, as you rightly said, we need to traverse up the tree. If we were using the DOM classes, we would have a handy $node->parentNode property leading us directly to the <chapter> (since it is the immediate ancestor to our <line>). SimpleXML does not have such a handy property, but we can use a relative XPath query to get it. The parent axis allows us to traverse up the tree.

Since xpath() returns an array we can cheat and use current() to access the first (and only) item in the array returned from it. Then it is just a matter of accessing the number attribute as above.

// On PHP 5.4 and later this can also be written as: current(...)['number']
$chapter = current($node->xpath('./parent::chapter'))->attributes()->number;

Book name -- the process for this is the same as that of accessing the chapter number. A relative XPath query from the <line> could make use of the ancestor axis like ./ancestor::book (or ./parent::chapter/parent::book). Hopefully you can figure out how to access its name attribute.
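Putting the three lookups together, a minimal self-contained sketch could look like the following. It loads the sample XML from the question as a string, and the $results array, its keys, and the hard-coded keyword are assumptions made for illustration. As before, the keyword is assumed to contain no quote characters.

```php
<?php
// Sample data copied from the question.
$source = <<<'XML'
<volume name="Early">
<book name="School Years">
<chapter number="1">
<line number="1">Here's the first line with Chicago in it.</line>
<line number="2">Here's a line that talks about Atlanta</line>
<line number="3">Here's a line that says chicagogo </line>
</chapter>
</book>
</volume>
XML;

$xml = simplexml_load_string($source);

$upper  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower  = 'abcdefghijklmnopqrstuvwxyz';
$search = strtolower('Chicago'); // hypothetical keyword, no quote characters

$query = "//line[contains(translate(text(), '$upper', '$lower'), '$search')]";

$results = [];
foreach ($xml->xpath($query) as $node) {
    // Relative queries from the matched <line>: parent axis, then ancestor axis.
    $chapter = current($node->xpath('./parent::chapter'));
    $book    = current($node->xpath('./ancestor::book'));

    $results[] = [
        'book'    => (string) $book['name'],
        'chapter' => (string) $chapter['number'],
        'line'    => (string) $node['number'],
        'text'    => trim((string) $node),
    ];
}

foreach ($results as $r) {
    echo "{$r['book']}, chapter {$r['chapter']}, line {$r['line']}: {$r['text']}" . PHP_EOL;
}
```

The (string) casts are there because SimpleXML returns attribute values as SimpleXMLElement objects. Casting them up front keeps later comparisons and array storage predictable.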
