How to parse a row only if one of its fields is bold? Nokogiri and Ruby

Question

I have this code, which collects all the product info I need:

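  require 'mechanize'
  require 'nokogiri'

  # assumption: agent is a Mechanize instance (the original snippet does not show its setup)
  agent = Mechanize.new

  # get main page
  page = agent.get "http://www.site.com.mx/tienda/index.php"

  # submit the first form on the page and parse the result
  search_form = page.forms.first
  search_result = agent.submit search_form

  doc = Nokogiri::HTML(search_result.body)
  rows = doc.css("table.articulos tr")

  # build one hash of fields per table row
  details = rows.collect do |row|
    detail = {}
    [
      [:sku, 'td[3]/text()'],
      [:desc, 'td[4]/text()'],
      [:qty, 'td[5]/text()'],
      [:qty2, 'td[5]/p/b/text()'],
      [:price, 'td[6]/text()']
    ].each do |name, xpath|
      # at_xpath returns nil when nothing matches, so to_s yields ""
      detail[name] = row.at_xpath(xpath).to_s.strip
    end
    detail
  end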

I need to collect the SKU into a variable, as in my code, but only for rows where qty2 exists (that is, where the quantity cell contains bold text).

Recommended answer

Modify your row selection logic to get only the rows you want. Update: this selects only the rows that have bold text in the quantity cell (the fifth td):

rows = doc.xpath('//table[@class="articulos"]/tr[td[5]/p/b]')

Update 2

Here's an example showing that this works.

require 'nokogiri'

html = <<__html__
<html>
<table class="articulos">
<tr>
  <td>1</td>
  <td>2</td>
  <td>sku1</td>
  <td>4</td>
  <td>5</td>
  <td>6</td>
</tr>
<tr>
  <td>2-1</td>
  <td>2-2</td>
  <td>sku2</td>
  <td>2-4</td>
  <td><p><b>2-5</b></p></td>
  <td>2-6</td>
</tr>
</table>
</html>
__html__

doc = Nokogiri::HTML(html)
doc.xpath('//table[@class="articulos"]/tr[td[5]/p/b]').each do |row|
  puts row.at_xpath('td[3]/text()')
end

Output:

sku2
