Using Ruby with Mechanize to log into a website


Question

I need to scrape data from a site, but it requires my login first. I've been using hpricot to successfully scrape other sites, but I'm new to using mechanize, and I'm truly baffled by how to work it.

I see this example commonly quoted:

require 'rubygems'
require 'mechanize'

a = Mechanize.new
a.get('http://rubyforge.org/') do |page|
  # Click the login link
  login_page = a.click(page.link_with(:text => /Log In/))

  # Submit the login form
  my_page = login_page.form_with(:action => '/account/login.php') do |f|
    f.form_loginname  = ARGV[0]
    f.form_pw         = ARGV[1]
  end.click_button

  my_page.links.each do |link|
    text = link.text.strip
    next unless text.length > 0
    puts text
  end
end

But I've found it extremely cryptic. The part I don't understand in particular is what's going on here:

f.form_loginname  = ARGV[0]
f.form_pw         = ARGV[1]

How have those input tags from the page suddenly become methods? Am I missing something here? When I try to recreate it to log in to AppDataPro (http://www.appdata.com/login), I run into the problem that the input names contain brackets, like this:

<Table> 
<tr><td width="150"> 
   <label for="user_session_username">Username</label><br /> 
</td><td > 
    <input id="user_session_username" name="user_session[username]" size="30" type="text" /> 
</td></tr> 
<tr><td> 
   <label for="user_session_password">Password</label><br /> 
</td><td> 
    <input id="user_session_password" name="user_session[password]" size="30" type="password" /> 
</td></tr> 
</table> 

This is my attempt to use mechanize:

    a = Mechanize.new
    a.get('http://www.appdata.com/login') do |page|
        # Click the login link
        login_page = a.click(page.link_with(:text => /Login/)) #login_page is basically a doc of appdata/login

        my_page = login_page.form_with(:action => '/login') do |f|
            f.user_session[username] =  '****username here?****'
            f.user_session[password] =  '****password here?****'
        end

    end

But it results in an error:

logintest01.rb:21:in `block (2 levels) in <main>': undefined method `user_session' for nil:NilClass (NoMethodError)

What am I doing wrong?
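
A note on reading this error: the receiver in the message is nil, which suggests the block variable `f` was itself nil when `user_session` was called on it. That would happen if `form_with(:action => '/login')` matched no form on the page, before the bracketed names ever come into play. The sketch below reproduces that pattern in plain Ruby. `first_form_with` and the `/user_session` action are hypothetical stand-ins for illustration, not Mechanize's API or the site's real markup.

```ruby
# Hypothetical finder: yields the first matching form, or nil when none match.
def first_form_with(forms, action)
  form = forms.find { |f| f[:action] == action }
  yield form if block_given?
  form
end

# Assumption for illustration: the page's only form uses a different action.
forms = [{ action: '/user_session' }]

error = nil
begin
  # No form has action '/login', so the block receives nil.
  first_form_with(forms, '/login') { |f| f.user_session }
rescue NoMethodError => e
  error = e
end
puts error.receiver.inspect   # the object the method was called on

# Checking the result before using it makes the real cause visible.
form = first_form_with(forms, '/login')
status = form ? 'form found' : 'no form matched'
puts status
```

If this is what is happening, inspecting the forms the page actually contains and matching on their real attributes would be the first thing to try.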

Recommended answer

This is the approach I usually take. It hasn't failed me:

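require 'mechanize'
# URL and form action come from the question; adjust them if the page differs.
agent = Mechanize.new
form = agent.get('http://www.appdata.com/login').form_with(:action => '/login')
# Look each field up by its full name attribute, brackets included.
username_field = form.field_with(:name => "user_session[username]")
username_field.value = "whatever_user"
password_field = form.field_with(:name => "user_session[password]")
password_field.value = "whatever_pwd"
form.submit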
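
As for how input names turn into methods: Mechanize's form object resolves unknown method calls against its field names through Ruby's `method_missing`, so `f.form_pw = ...` works without any such method being defined. A name such as `user_session[username]` cannot be written as a method call, which is why an explicit lookup by name is needed. Below is a minimal plain-Ruby sketch of the idea. `SketchForm` is a hypothetical stand-in written for this illustration, not Mechanize's actual implementation.

```ruby
# Hypothetical stand-in for a form object; not Mechanize's own code.
class SketchForm
  Field = Struct.new(:name, :value)

  def initialize(*names)
    @fields = names.map { |n| Field.new(n, nil) }
  end

  # Find a field by its exact name attribute, brackets and all.
  def field_with(name:)
    @fields.find { |f| f.name == name }
  end

  # Unknown methods are matched against field names, so a call like
  # `form.form_loginname = 'x'` behaves as an accessor.
  def method_missing(meth, *args)
    field = field_with(name: meth.to_s.chomp('='))
    return super unless field
    return field.value if args.empty?

    field.value = args.first
  end

  def respond_to_missing?(meth, include_private = false)
    !field_with(name: meth.to_s.chomp('=')).nil? || super
  end
end

form = SketchForm.new('form_loginname', 'user_session[username]')

# Resolved through method_missing, since the name is a valid identifier.
form.form_loginname = 'alice'
puts form.form_loginname

# Ruby parses `form.user_session[username]` as a call to `user_session`
# followed by `[username]`, so a bracketed name needs an explicit lookup.
form.field_with(name: 'user_session[username]').value = 'whatever_user'
puts form.field_with(name: 'user_session[username]').value
```

The explicit lookup works for any field name, which makes it the safer default when the names are not plain identifiers.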
