Using a Ruby script to log in to a website via HTTPS
Question
Alright, so here's the deal: I'm working on a Ruby app that will take data from a website and aggregate that data into an XML file.
The website I need to take data from does not have any APIs I can make use of, so the only thing I can think of is to log in to the website, sequentially load the pages that have the data I need (in this case, PMs; I want to archive them), and then parse the returned HTML.
The problem, though, is that I don't know of any way to programmatically simulate a login session.
Would anyone have any advice, or know of any proven methods that I could use to log in to an HTTPS page and then programmatically load pages from the site using the temporary cookie session from the login? It doesn't have to be a Ruby-only solution; I just want to know how I can actually do this. And if it helps, the website in question uses Microsoft's .NET Passport service as its login/session mechanism.
Any input on the matter is welcome. Thanks.
Recommended answer
Mechanize
Mechanize is a Ruby library that imitates the behaviour of a web browser. You can click links, fill out forms, and submit them. It even keeps a history and remembers cookies. It seems your problem could be solved easily with the help of Mechanize.
The following example is taken from http://mechanize.rubyforge.org :
require 'rubygems'
require 'mechanize'

a = Mechanize.new
a.get('http://rubyforge.org/') do |page|
  # Click the login link
  login_page = a.click(page.link_with(:text => /Log In/))

  # Submit the login form
  my_page = login_page.form_with(:action => '/account/login.php') do |f|
    f.form_loginname = ARGV[0]
    f.form_pw        = ARGV[1]
  end.click_button

  # Print the text of every non-empty link on the page shown after login
  my_page.links.each do |link|
    text = link.text.strip
    next unless text.length > 0
    puts text
  end
end
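
To make the "remembers cookies" part concrete, here is a rough sketch of the bookkeeping a library like Mechanize does for you, written with only Ruby's standard Net::HTTP. It is not how Mechanize is implemented. The helper names (parse_cookies, cookie_header, fetch_after_login), the URLs, and the form field names are all hypothetical placeholders, and the sketch assumes the login page and the data page are on the same host.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: collect "name=value" pairs from Set-Cookie header values.
# Attributes such as Path, Expires, or Secure are ignored in this sketch.
def parse_cookies(set_cookie_headers)
  Array(set_cookie_headers).each_with_object({}) do |header, jar|
    name, value = header.split(';', 2).first.split('=', 2)
    jar[name.strip] = value.to_s.strip
  end
end

# Hypothetical helper: build the Cookie request header from the stored pairs.
def cookie_header(jar)
  jar.map { |name, value| "#{name}=#{value}" }.join('; ')
end

# Assumed flow: POST the credentials, keep the cookies, then GET a page with them.
# login_url, page_url, and the field names are placeholders, not a real site's values.
def fetch_after_login(login_url, page_url, fields)
  login_uri = URI(login_url)
  jar = {}

  Net::HTTP.start(login_uri.host, login_uri.port,
                  use_ssl: login_uri.scheme == 'https') do |http|
    login = Net::HTTP::Post.new(login_uri)
    login.set_form_data(fields)
    response = http.request(login)
    jar.merge!(parse_cookies(response.get_fields('Set-Cookie')))

    # Send the session cookies back on the follow-up request
    page = Net::HTTP::Get.new(URI(page_url))
    page['Cookie'] = cookie_header(jar)
    http.request(page).body
  end
end
```

A real login form, especially one behind a single sign-on service, usually also involves redirects and hidden form fields that this sketch does not handle. That is the kind of detail Mechanize takes care of, which is why it is the easier route here.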