Extract data from HTML Table with mechanize

Views: 82  Published: 2020/5/8 1:03:03  Tags: html ruby-on-rails ruby parsing mechanize

Question

First of all, here is the sample HTML table:

 <tr>
   <td><strong>Kangchenjunga </strong></td>
   <td>8,586m<br /></td>
   <td>28,169ft</td>
   <td><div align="center">Nepal/India </div></td>
   <td>1955; G. Band, J. Brown </td>
 </tr>

ARGV[0] will hold the name of a mountain (the first column), and the return value should be the last column: the people who climbed the mountain for the first time.

So I need to check whether a row's first column matches ARGV[0], and if it does, return that row's last column without the date.

require 'mechanize'

# Fetch the page body as a string (the URL needs an explicit scheme)
p = Mechanize.new.get('http://www.alpineascents.com/8000m-peaks.asp').body
if p.include?('<strong>' + ARGV[0])
  puts 'ok'
end

The code above prints "ok" if ARGV[0] appears in the body of the HTML document. How can I get the last column of the same row in which ARGV[0] is found?

Example:

<tr>
 <td><strong>GIVE THIS AS A PARAMETER </strong></td>
 <td>SKIP THIS<br /></td>
 <td>SKIP THIS</td>
 <td><div align="center">SKIP THIS</div></td>
 <td>I WANT IT TO RETURN THIS</td>
</tr>

I'm really new to Ruby.

Recommended answer

A more succinct version that relies more on the black magic of XPath :)

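require 'nokogiri'
require 'open-uri'

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer opens URLs
doc = Nokogiri::HTML(URI.open('http://www.alpineascents.com/8000m-peaks.asp'))
# Pick the fifth cell of the row whose <strong> text matches the mountain name;
# normalize-space() ignores the trailing space present in the markup
last_td = doc.xpath("//tr[td[strong[normalize-space()='#{ARGV[0]}']]]/td[5]")

# Drop everything up to the semicolon (the year), leaving only the climbers
puts last_td.text.gsub(/.*?;/, '').strip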
