Scraping way2sms with mechanize
Question
I am trying to send an SMS by scraping way2sms.com, but I am unable to log in to way2sms.com using mechanize.
I am using the following code to submit the login form.
import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_refresh(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0')]
res=br.open('http://wwwa.way2sms.com/content/prehome.jsp')
link=list(br.links())[5]
res=br.follow_link(link)
br.form = list(br.forms())[0]
br.form.find_control('username').value=USERNAME #user name
br.form.find_control('password').value=PASSWORD #password
res=br.submit()
After submitting the form, I get the login page back again.
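If the login page comes back after submitting, one thing worth ruling out is that `list(br.forms())[0]` is not actually the login form. The sketch below is not way2sms-specific: it uses only the Python 3 standard library to list every form in an HTML string together with its input names. `FormLister`, `list_forms`, and the sample HTML are hypothetical names and data made up for illustration, not the site's real markup.

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collect each <form> with its action and the names of its inputs."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> encountered
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action"), "fields": []}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def list_forms(html):
    """Return a list of {'action': ..., 'fields': [...]} dicts, in page order."""
    parser = FormLister()
    parser.feed(html)
    return parser.forms


# Hypothetical page for illustration only; NOT the real way2sms markup.
SAMPLE_HTML = """
<form action="/search"><input name="q"></form>
<form action="/login" method="post">
  <input name="username"><input name="password" type="password">
</form>
"""

for index, form in enumerate(list_forms(SAMPLE_HTML)):
    print(index, form["action"], form["fields"])
```

In this made-up sample the login fields sit on the second form (index 1), so selecting form 0 would submit the wrong one. With mechanize itself, iterating over `br.forms()` and printing each form gives a similar overview of the live page.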
Recommended answer
Just replace the username and password values with your own.
import mechanize
import cookielib
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
# Follow refresh 0, but do not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
# User-Agent
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
url = 'http://site25.way2sms.com/content/index.html?'
# Open the website
op = br.open(url)
# Select the first form on the page
br.select_form(nr=0)
username = 'mobilenumberhere'
password = 'passwordhere'
#Give username and password
br.form['username'] = username
br.form['password'] = password
br.submit()
# Check whether the login succeeded: enter wrong details on way2sms
# and look at the resulting URL to see why this test works.
if username in br.geturl():
    print("Login Failed")
else:
    print("Login Successful. You are at %s" % br.geturl())
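The code above imports `cookielib`, which exists only on Python 2; on Python 3 the same classes live in `http.cookiejar`. A minimal sketch of an import that works on both, assuming nothing beyond the standard library:

```python
try:
    import cookielib  # Python 2 name
except ImportError:
    import http.cookiejar as cookielib  # Python 3: same module, renamed

# LWPCookieJar is the same class the answer passes to br.set_cookiejar().
cj = cookielib.LWPCookieJar()
print(type(cj).__name__, len(cj))
```

The resulting `cj` can be handed to `br.set_cookiejar(cj)` exactly as in the answer.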