Regular expressions and XPath query


Problem description


I have the following code:

        <?php
        $doc = new DOMDocument;
        $doc->loadHTML('<html>
                       <head>
                        <title>bar , this is an example</title>
                       </head>
                       <body>
                       <h1>latest news</h1>
                       foo <strong>bar</strong>
                       <i>foobar</i>
                       </body>
                       </html>');

        $xpath = new DOMXPath($doc);
        // Select every element whose first child text node contains "bar"
        foreach ($xpath->query('//*[contains(child::text(),"bar")]') as $e) {
            echo $e->tagName, "\n";
        }

It prints:

       title
       strong
       i

This code finds any HTML element that contains the string "bar", so it also matches words that merely contain it, such as "foobar". I want to change the query so that it matches only the whole word "bar", without any prefix or suffix.

I think this could be solved by changing the query to look for every "bar" that has no letter directly before or after it, or that has a space before or after it.

This code comes from a past question here, by VolkerK.

Thanks

Solution

If you are looking for just "bar" with XPath 1.0, you'll have to use a combination of functions; there are no regular expressions in XPath 1.0.

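$xpath->query("//*[
                . = 'bar' or
                starts-with(., 'bar ') or
                contains(., ' bar ') or
                (' bar' = substring(., string-length(.) - string-length(' bar') + 1))
              ]");

Basically, this says: match strings that are exactly 'bar', that start with 'bar ' (notice the trailing space), that contain ' bar ' (notice the spaces before and after), or that end with ' bar' (notice the leading space). Without those spaces, the start and end tests would also accept "barn" and "foobar". Note that ends-with is an XPath 2.0 function, so the last line emulates it with substring() and string-length(), using code from a previous Stack Overflow answer.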
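
The start, middle, and end tests can also be folded into a single contains() by normalizing whitespace and padding the string with one space on each side, a common XPath 1.0 idiom. The following is a minimal sketch rather than the answerer's code. It reuses the question's document and, as an assumption about what is wanted, tests each element's own text nodes (as the question's query does) instead of `.`, whose string value includes descendant text and can therefore also select ancestors of a matching element.

```php
<?php
$doc = new DOMDocument;
$doc->loadHTML('<html>
  <head>
    <title>bar , this is an example</title>
  </head>
  <body>
    <h1>latest news</h1>
    foo <strong>bar</strong>
    <i>foobar</i>
  </body>
</html>');

$xpath = new DOMXPath($doc);

// normalize-space() trims the text and collapses runs of whitespace;
// concat() then adds one space on each side, so "bar" at the very start
// or end of a text node is also surrounded by spaces.
$query = "//*[text()[contains(concat(' ', normalize-space(.), ' '), ' bar ')]]";

$tags = [];
foreach ($xpath->query($query) as $e) {
    $tags[] = $e->tagName;
}

echo implode("\n", $tags), "\n";
```

With this document the loop should list title and strong but not i, because " foobar " does not contain " bar ".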

If contains(., ' bar ') is not enough, because the text may be "one bar, over" or "This bar. That bar.", with other punctuation directly after the 'bar', you could try this contains instead:

contains(translate(., '.,[]', '    '), ' bar ') or

That translates each of the characters . , [ ] to a single space. The third argument needs one space per character in the second, because translate() deletes any character that has no counterpart there. So "one bar, over" becomes "one bar  over", and thus matches ' bar ' as expected.
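
To see the punctuation handling end to end, here is a hedged sketch that runs two queries over a small made-up document (the `<p>` contents are illustrative, not from the question). The first stays in pure XPath 1.0 and combines translate() with space padding. The second is PHP-specific and not part of the answer above: it assumes it is acceptable to leave pure XPath and uses DOMXPath::registerPhpFunctions() to call preg_match with a word-boundary pattern.

```php
<?php
$doc = new DOMDocument;
$doc->loadHTML('<html><body>
  <p>one bar, over</p>
  <p>This bar. That bar.</p>
  <p>foobar</p>
  <p>[bar]</p>
</body></html>');

$xpath = new DOMXPath($doc);

// Pure XPath 1.0: turn each listed punctuation character into a space
// (four characters in the second argument, four spaces in the third),
// then pad both ends so "bar" at the start or end also matches.
$pure = "//p[contains(concat(' ', translate(normalize-space(.), '.,[]', '    '), ' '), ' bar ')]";

// PHP-specific: expose preg_match to XPath under the php: namespace.
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('preg_match');
// Compare with 1 explicitly instead of relying on boolean conversion
// of whatever type the PHP return value is mapped to.
$regex = '//p[php:functionString("preg_match", "/\bbar\b/", .) = 1]';

// Helper (hypothetical name): collect the text of each matched node.
function matchedTexts(DOMNodeList $nodes): array
{
    $out = [];
    foreach ($nodes as $node) {
        $out[] = $node->textContent;
    }
    return $out;
}

$pureMatches  = matchedTexts($xpath->query($pure));
$regexMatches = matchedTexts($xpath->query($regex));

print_r($pureMatches);
print_r($regexMatches);
```

Both queries should select the three paragraphs that contain "bar" as a whole word and skip "foobar". The translate() version only knows about the punctuation listed in its second argument, whereas the regular expression treats any non-word character as a boundary.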
