X-Ray: paginate by filtering links on their text


Question

I am using x-ray to scrape a webpage with pagination. Here is some of the HTML:

<td align="center" style="font-size: 11pt;">
  <div class="paginate" style="font-size: 11pt;">
    <span class="disabled">Previous</span>
    <span class="current">1</span>
    <a href="link2.html">2</a>
    <a href="link2.html">Next</a>
  </div>
</td>

I would like to paginate by following the Next button, but the example I found selects the pagination link by its class name:

x('https://blog.ycombinator.com/', '.post', [{
  title: 'h1 a',
  link: '.article-title@href'
}])
  .paginate('.nav-previous a@href')

How can I paginate by selecting the link inside the Next button?

Thanks.

Recommended answer

Filter the anchors by their text:

.paginate('.paginate a:contains(Next)@href')
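
To show how that selector fits into a full call, here is a minimal sketch. It assumes x-ray is installed from npm; `linkByText` and `scrapePages` are hypothetical helper names introduced for illustration, and the `.post` scope and item fields are placeholders copied from the question's example rather than selectors for the real page.

```javascript
// Hypothetical helper (not part of x-ray): builds a selector string for an
// anchor inside `scope` whose text contains `text`, using x-ray's
// `selector@attribute` syntax.
function linkByText(scope, text, attr = 'href') {
  return `${scope} a:contains(${text})@${attr}`;
}

// Sketch of a paginated scrape. The '.post' scope and the item fields are
// placeholders taken from the question's example.
function scrapePages(url, maxPages, done) {
  const Xray = require('x-ray'); // assumes `npm install x-ray`
  const x = Xray();

  const job = x(url, '.post', [{
    title: 'h1 a',
    link: '.article-title@href'
  }])
    // Resolves to '.paginate a:contains(Next)@href'
    .paginate(linkByText('.paginate', 'Next'))
    // Stop after maxPages pages so the crawl cannot run unbounded
    .limit(maxPages);

  job(done); // done(err, results)
}
```

A call such as `scrapePages(url, 3, (err, results) => { ... })` would follow the Next link for up to three pages. Note that `:contains(Next)` matches on a substring of the link text, so an anchor labelled "Next page" would match too. In the HTML above the "2" link does not match, and the disabled "Previous" entry is a `span`, so it is not selected either.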
