How do I integrate these two condition blocks into my Ruby code?


Question

How do I integrate these two conditions into my code, which currently scrapes without them? My code already works, but it scrapes all rows (both bold and non-bold values) and does not extract the title attribute string.

Condition 1: parse a table row only if its fifth cell contains bold text:

doc = Nokogiri::HTML(html)
doc.xpath('//table[@class="articulos"]/tr[td[5]/p/b]').each do |row|
  puts row.at_xpath('td[3]/text()')
end


Condition 2: get only the number from the title attribute string:

doc     = Nokogiri::HTML(html)
numbers = doc.xpath('//p[@title]').collect { |p| p[:title].gsub(/[^\d]/, '') }


My code:

doc = Nokogiri::HTML(search_result.body)
rows = doc.css("table.articulos tr")
i = 0
details = rows.each do |row|
  detail = {}  
  [
    [:sku, 'td[3]/text()'],
    [:desc, 'td[4]/text()'],
    [:qty, 'td[5]/text()'],
    [:qty2, 'td[5]/p/b/text()'],
    [:price, 'td[6]/text()']
  ].each do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  i = i + 1
  detail
end

Second attempt:

doc = Nokogiri::HTML(search_result.body)
rows = doc.xpath('//table[@class="articulos"]/tr[td[5]/p/b]')
i = 0
details = rows.each do |row|
  detail = {}
  [
    [:sku, 'td[3]/text()'],
    [:desc, 'td[4]/text()'],
    [:stock, "td[5]/p[@title]"],
    [:price, 'td[6]/text()']
  ].each do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  i = i + 1
  if detail[:sku] != ""
    price = detail[:price].split

    if price[1] == "D"
      currency = 144
    else
      currency = 168
    end
    stock = detail[:stock].gsub(/[^\d]/, '-')
    cost = price[0].gsub(",", "").to_f
  end
end

For stock, instead of scraping just the title string, it scrapes the whole paragraph:

<p style="margin-top: 0px; margin-bottom:0px; cursor:hand" title="2 en su sucursal"><b>10</b></p>

when I only want the 2 from the title attribute.

Recommended answer

Here is my working code. It may need a little cleanup, but it works: the results are correct, although I get a lot of nils.

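doc = Nokogiri::HTML(search_result.body)
# Condition 1: keep only rows whose fifth cell contains bold text
rows = doc.xpath('//table[@class="articulos"]/tr[td[5]/p/b]')
i = 0
details = rows.each do |row|
  detail = {}
  [
    [:sku, 'td[3]/text()'],
    [:desc, 'td[4]/text()'],
    [:stock, "td[5]/p/@title"],  # Condition 2: the title attribute, not the whole <p>
    [:price, 'td[6]/text()']
  ].each do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  i = i + 1
  if detail[:sku] != ""
    price = detail[:price].split

    if price[1] == "D"
      currency = 144
    else
      currency = 168
    end
    # detail[:stock] is already a String at this point, so anchor['title']
    # finds no "title" substring and returns nil (the likely source of the nils).
    # String#each only exists on Ruby 1.8; newer versions raise NoMethodError here.
    stock = detail[:stock].each do |anchor|
      puts anchor['title']
    end
    stock1 = stock.gsub(/[^\d]/, '')  # keep only the digits of the title
    cost = price[0].gsub(",", "").to_f
  end
end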
