使用python到asp.net页面发布请求 [英] post request using python to asp.net page
问题描述
我想从 http://www.indiapost.gov.in/pin/废弃PINCODE ,我正在编写以下代码。
i want scrap the PINCODEs from "http://www.indiapost.gov.in/pin/", i am doing with following code written.
import urllib
import urllib2
headers = {
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Origin': 'http://www.indiapost.gov.in',
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17',
'Content-Type': 'application/x-www-form-urlencoded',
'Referer': 'http://www.indiapost.gov.in/pin/',
'Accept-Encoding': 'gzip,deflate,sdch',
'Accept-Language': 'en-US,en;q=0.8',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
}
viewstate = 'JulXDv576ZUXoVOwThQQj4bDuseXWDCZMP0tt+HYkdHOVPbx++G8yMISvTybsnQlNN76EX/...'
eventvalidation = '8xJw9GG8LMh6A/b6/jOWr970cQCHEj95/6ezvXAqkQ/C1At06MdFIy7+iyzh7813e1/3Elx...'
url = 'http://www.indiapost.gov.in/pin/'
formData = (
('__EVENTVALIDATION', eventvalidation),
('__EVENTTARGET',''),
('__EVENTARGUMENT',''),
('__VIEWSTATE', viewstate),
('__VIEWSTATEENCRYPTED',''),
('__EVENTVALIDATION', eventvalidation),
('txt_offname',''),
('ddl_dist','0'),
('txt_dist_on',''),
('ddl_state','2'),
('btn_state','Search'),
('txt_stateon',''),
('hdn_tabchoice','3')
)
from urllib import FancyURLopener
class MyOpener(FancyURLopener):
version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17'
myopener = MyOpener()
encodedFields = urllib.urlencode(formData)
f = myopener.open(url, encodedFields)
print f.info()
try:
fout = open('tmp.txt', 'w')
except:
print('Could not open output file\n')
fout.writelines(f.readlines())
fout.close()
我收到来自服务器的回复抱歉这个网站遇到了严重问题,请尝试重新加载页面或联系网站管理员。
pl建议我哪里出错..
i am getting response from server as "Sorry this site has encountered a serious problem, please try reloading the page or contact webmaster." pl suggest where i am going wrong..
推荐答案
你从哪里得到的价值 viewstate
和 eventvalidation
?一方面,他们不应该以......结束,你必须省略一些东西。另一方面,它们不应该是硬编码的。
Where did you get the value viewstate
and eventvalidation
? On one hand, they shouldn't end with "...", you must have omitted something. On the other hand, they shouldn't be hard-coded.
一个解决方案是这样的:
One solution is like this:
- 通过URL http://www.indiapost.gov.in/pin/检索页面a>没有任何表单数据
- 解析并检索表单值,如
__ VIEWSTATE
和__ EVENTVALIDATION
(您可以使用 BeautifulSoup )。 - 获取通过添加第2步中的重要表单数据来搜索结果(第二个HTTP请求)。
- Retrieve the page via URL "http://www.indiapost.gov.in/pin/" without any form data
- Parse and retrieve the form values like
__VIEWSTATE
and__EVENTVALIDATION
(you may take use of BeautifulSoup). - Get the search result(second HTTP request) by adding vital form-data from step 2.
更新:
根据上述想法,我会略微修改你的代码以使其正常工作:
According to the above idea, I modify your code slightly to make it work:
import urllib
from bs4 import BeautifulSoup
headers = {
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Origin': 'http://www.indiapost.gov.in',
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17',
'Content-Type': 'application/x-www-form-urlencoded',
'Referer': 'http://www.indiapost.gov.in/pin/',
'Accept-Encoding': 'gzip,deflate,sdch',
'Accept-Language': 'en-US,en;q=0.8',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
}
class MyOpener(urllib.FancyURLopener):
version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17'
myopener = MyOpener()
url = 'http://www.indiapost.gov.in/pin/'
# first HTTP request without form data
f = myopener.open(url)
soup = BeautifulSoup(f)
# parse and retrieve two vital form values
viewstate = soup.select("#__VIEWSTATE")[0]['value']
eventvalidation = soup.select("#__EVENTVALIDATION")[0]['value']
formData = (
('__EVENTVALIDATION', eventvalidation),
('__VIEWSTATE', viewstate),
('__VIEWSTATEENCRYPTED',''),
('txt_offname', ''),
('ddl_dist', '0'),
('txt_dist_on', ''),
('ddl_state','1'),
('btn_state', 'Search'),
('txt_stateon', ''),
('hdn_tabchoice', '1'),
('search_on', 'Search'),
)
encodedFields = urllib.urlencode(formData)
# second HTTP request with form data
f = myopener.open(url, encodedFields)
try:
# actually we'd better use BeautifulSoup once again to
# retrieve results(instead of writing out the whole HTML file)
# Besides, since the result is split into multipages,
# we need send more HTTP requests
fout = open('tmp.html', 'w')
except:
print('Could not open output file\n')
fout.writelines(f.readlines())
fout.close()
这篇关于使用python到asp.net页面发布请求的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!