Web page scraping gems/tools available in Ruby

Views: 103 | Published: 2020/5/4 8:24:58 | Tags: ruby html-parsing lxml scrape

Problem description

I'm trying to scrape web pages in a Ruby script that I'm working on. The purpose of the project is to show which ETFs and stock mutual funds are most compatible with the value investing philosophy.

Some examples of pages I'd like to scrape are:

http://finance.yahoo.com/q/pr?s=SPY+Profile
http://finance.yahoo.com/q/hl?s=SPY+Holdings
http://www.marketwatch.com/tools/mutual-fund/list/V

What web scraping tools do you recommend for Ruby, and why? Keep in mind that there are thousands of stock funds out there, so any tool I use has to be reasonably quick.

I am new to Ruby, but I have experience using lxml to scrape web pages in Python (https://github.com/jhsu802701/dopplervalueinvesting/blob/master/screen.py). Once the pages on 5000+ stocks are downloaded, lxml can scrape them all in just a few minutes. (I remember trying BeautifulSoup but rejecting it because it was too slow.)
