Scraping way2sms with mechanize


Question

I am trying to send an SMS by scraping way2sms.com, but I am unable to log in to way2sms.com using mechanize.

I am using the following code to submit the login form.

import mechanize

# Placeholders: replace with your own way2sms credentials.
USERNAME = 'mobilenumberhere'
PASSWORD = 'passwordhere'

br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_refresh(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0')]
res = br.open('http://wwwa.way2sms.com/content/prehome.jsp')
link = list(br.links())[5]       # sixth link on the page
res = br.follow_link(link)
br.form = list(br.forms())[0]    # first form on the page
br.form.find_control('username').value = USERNAME
br.form.find_control('password').value = PASSWORD
res = br.submit()

After submitting the form, I get the login page back again.

Recommended answer

Just replace the username and password values with your own username and password.

import mechanize
try:
    import cookielib                    # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3
br = mechanize.Browser()

# Cookie Jar 
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options
br.set_handle_equiv(True) 
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# Follow refresh 0, but do not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

# User-Agent
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

url = 'http://site25.way2sms.com/content/index.html?'

# Open the website
op = br.open(url)

# Select the first form on the page
br.select_form(nr=0)

username = 'mobilenumberhere'
password = 'passwordhere'


# Fill in the username and password
br.form['username'] = username
br.form['password'] = password

br.submit()


# Check whether the login succeeded.
# To see why this works, go to way2sms, enter wrong details, and look at the resulting URL.
if username in br.geturl():
    print("Login Failed")
else:
    print("Login Successful. You are at " + br.geturl())
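
The URL check at the end of the script can be pulled out into a small function so that it can be tried without touching the network. This is only a minimal sketch, assuming (as the comment in the answer hints) that a failed login ends up on a URL containing the submitted username. `login_failed` is a hypothetical helper name, and the example URLs are made up for illustration; the real way2sms redirect targets may differ.

```python
def login_failed(final_url, username):
    """Return True if the post-login URL suggests the login was rejected.

    Assumption (taken from the answer above): after a failed login the
    site redirects to a URL that contains the submitted username.
    """
    return username in final_url


# Illustrative URLs only, not real way2sms addresses.
print(login_failed("http://example.invalid/entry?mobileNo=9999999999", "9999999999"))
print(login_failed("http://example.invalid/main", "9999999999"))
```

In the script above, this would be called as `login_failed(br.geturl(), username)` right after `br.submit()`.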
