Python urllib2.open Connection reset by peer error
Problem description
I'm trying to scrape a page using Python.
The problem is, I keep getting Errno 54 Connection reset by peer.
The error comes when I run this code:
import urllib2
urllib2.urlopen("http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100020629&divisionDisplayName=Stanford&departmentDisplayName=ILAC&courseDisplayName=126&sectionDisplayName=01&demoKey=d&purpose=browse")
This happens for all the URLs on this page. What is the issue?
Recommended answer
$> telnet www.bkstr.com 80
Trying 64.37.224.85...
Connected to www.bkstr.com.
Escape character is '^]'.
GET /webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100020629&divisionDisplayName=Stanford&departmentDisplayName=ILAC&courseDisplayName=126&sectionDisplayName=01&demoKey=d&purpose=browse HTTP/1.0
Connection closed by foreign host.
You're not going to have any joy fetching that URL from python, or anywhere else. If it works in your browser then there must be something else going on, like cookies or authentication or some such. Or, possibly, the server's broken or they've changed their configuration.
Try opening it in a browser that you've never accessed that site in before to check. Then log in and try it again.
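The telnet check above can also be scripted, which makes it easier to repeat while testing. This is only a rough sketch written for Python 3 (the `socket` module behaves the same way there as the session shown); the helper names `build_request` and `probe` are made up for illustration, and it sends the same bare HTTP/1.0 request line with no extra headers.

```python
import socket


def build_request(path):
    """Build the same bare HTTP/1.0 request that was typed into telnet."""
    return "GET {} HTTP/1.0\r\n\r\n".format(path).encode("ascii")


def probe(host, path, port=80, timeout=10):
    """Send the request and return whatever bytes the server sends back.

    An empty result means the server closed the connection without replying;
    a ConnectionResetError means it reset the connection outright.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed its side of the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `probe` with the host and path from the transcript should show the same behaviour as telnet: either no bytes at all or a reset, rather than an HTTP response.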
Edit: it was cookies after all:
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Visit the home page first so the server sets its cookie
opener.open("http://www.bkstr.com/")
# Now open the page we want; the cookie is sent along automatically
data = opener.open("http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100020629&divisionDisplayName=Stanford&departmentDisplayName=ILAC&courseDisplayName=126&sectionDisplayName=01&demoKey=d&purpose=browse").read()
The output looks ok, but you'll have to check that it does what you want :)
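For anyone on Python 3, where `urllib2` and `cookielib` became `urllib.request` and `http.cookiejar`, the same idea might look roughly like the sketch below. The function names `build_cookie_opener` and `fetch_with_session` and the 30-second timeout are assumptions for illustration, not part of the original code, and it has not been checked against the site itself.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_with_session(landing_url, target_url, timeout=30):
    """Visit landing_url to pick up cookies, then fetch target_url with them."""
    opener, _jar = build_cookie_opener()
    # First request: gives the server a chance to set its cookie.
    opener.open(landing_url, timeout=timeout).close()
    # Second request: the cookie jar adds the stored cookie automatically.
    with opener.open(target_url, timeout=timeout) as response:
        return response.read()
```

Here `landing_url` would be the site's home page and `target_url` the course materials URL from the question.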