lxml XPath position() does not work

Views: 34 Published: 2021/10/2 19:39:59 Tags: python python-3.x xpath


Question

I tried to scrape a page via XPath, but I could not get it to work as expected.

The page looks like this:

<tag1>
    <tag2>
          ....
              <div id=article>
                  <p> stuff1 </p>
                  <p> stuff2 </p>
                  <p> ...... </p>
                  <p> stuff30 </p>

I want to extract stuff1 through stuff30 as a string. Here is my Python code snippet.

import lxml.html
import urllib.request

html = urllib.request.urlopen('http://www.something.com/news/blah/').read()
root = lxml.html.fromstring(html)

content = root.xpath('string(//div[@id="article"]/p[position()=>1 and position()<=last()]/.)')

This code did not return anything.

If I replace the position() predicate with a single element index, it works.

content = root.xpath('string(//div[@id="article"]/p[25]/.)')

This code correctly returns stuff25.

I don't want to run a for loop just for this. I believe there is a way to make my code work with position(), but I'm not sure what is wrong with it.

Recommended answer

That's because you have position()=>1; it should be position()>=1. XPath has no => operator, so the original expression is not valid.

content = root.xpath('string(//div[@id="article"]/p[position()>=1 and position()<=last()]/.)')

This sets content to stuff1 only, because string() applied to a node-set returns the string value of just its first node.
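To get every paragraph rather than only the first, one option is to select the p elements themselves and join their text in Python. The following is a minimal sketch, not the original poster's code: the inline HTML is a hypothetical three-paragraph stand-in for the real page, and the variable names are illustrative. It also checks that the => expression is rejected by lxml as an XPath error, which is the behavior I would expect from libxml2's parser.

```python
import lxml.html
from lxml import etree

# Hypothetical stand-in for the downloaded page; only the structure matters.
html = """
<html><body>
  <div id="article">
    <p> stuff1 </p>
    <p> stuff2 </p>
    <p> stuff3 </p>
  </div>
</body></html>
"""
root = lxml.html.fromstring(html)

# '=>' is not an XPath operator, so this expression should fail to compile.
try:
    root.xpath('string(//div[@id="article"]/p[position()=>1])')
    invalid_rejected = False
except etree.XPathError:
    invalid_rejected = True

# With '>=' the expression is valid, but string() on a node-set
# uses only the first node in document order.
first_only = root.xpath(
    'string(//div[@id="article"]/p[position()>=1 and position()<=last()])'
)

# Selecting the elements returns a list; join their text on the Python side.
paragraphs = root.xpath('//div[@id="article"]/p')
content = "\n".join(p.text_content().strip() for p in paragraphs)
print(content)
```

The join still iterates over the list, but without an explicit for statement. Since the predicate position()>=1 and position()<=last() matches every p anyway, it can be dropped, as the sketch does when selecting the elements.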
