Python POST Request Failing, [Errno 10054] An existing connection was forcibly closed by the remote host

Problem description

I'm using Beautiful Soup to scrape a web page. The code used to work, but it has stopped working. I think the source site changed its login page, so I replaced loginurl, but the script apparently cannot connect to the new URL, even though I can connect to it directly. Can someone try running this and tell me what I'm doing wrong?

import requests
from bs4 import BeautifulSoup
import re
import pymysql
import datetime

myurl = 'http://www.cbssports.com'

loginurl = 'https://auth.cbssports.com/login/index'

try:
    response = requests.get(loginurl)
except requests.exceptions.ConnectionError as e:
    print("BAD DOMAIN")

payload = {
    'dummy::login_form': 1,
    'form::login_form': 'login_form',
    'xurl': myurl,
    'master_product': 150,
    'vendor': 'cbssports',
    'userid': 'myuserid',
    'password': 'mypassword',
    '_submit': 'Sign in',
}

session = requests.session()
p = session.post(loginurl, data=payload)

#(code to scrape the web page)

I get the following error: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='auth.cbssports.com', port=443): Max retries exceeded with url: /login (Caused by : [Errno 10054] An existing connection was forcibly closed by the remote host)

Is the website actively blocking my automated login? Or is something wrong in my data payload?

Here's a simpler piece of code that shows the same problem:

import requests

myurl = 'http://www.cbssports.com'

loginurl = 'https://auth.cbssports.com/login/index'

try:
    response = requests.get(myurl)
except requests.exceptions.ConnectionError as e:
    print("My URL is BAD")

try:
    response = requests.get(loginurl)
except requests.exceptions.ConnectionError as e:
    print("Login URL is BAD")

Note that the login URL fails but the main one does not. I can access both URLs manually in a browser. So why is the login page not accessible from Python?
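When comparing the two URLs like this, it may help to print the actual exception instead of a fixed message, since the error text (here, Errno 10054) says more than "BAD". The following is only a sketch: probe is a hypothetical helper name, and the fetch argument is assumed to be something like requests.get, passed in so that the helper does not hard-code a particular client.

```python
def probe(urls, fetch):
    """Try each URL with `fetch`; record the HTTP status or the error text."""
    results = {}
    for url in urls:
        try:
            response = fetch(url)
        except OSError as exc:
            # requests' exceptions subclass IOError (an alias of OSError in
            # Python 3), so this also catches requests.exceptions.ConnectionError.
            results[url] = "failed: {}: {}".format(type(exc).__name__, exc)
        else:
            results[url] = "ok: HTTP {}".format(response.status_code)
    return results


# Hypothetical usage with the URLs from the question:
#   import requests
#   for url, outcome in probe([myurl, loginurl], requests.get).items():
#       print(url, "->", outcome)
```

Reporting the exception per URL makes it easier to see whether the failure is a reset connection, a DNS problem, or something else.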

Recommended answer

OK, I'm not sure why this worked, but I solved it by simply changing https to http in the login address. Like magic, it worked. It appears that CBS may also serve a non-HTTPS version of the same page (?).
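A minimal sketch of that workaround, assuming the session and payload from the question. The helper names with_scheme and login_over_http are hypothetical and not part of requests. The session is passed in as an argument, so any object with a post method will do. Keep in mind that a login posted over plain HTTP sends the password unencrypted.

```python
from urllib.parse import urlsplit, urlunsplit


def with_scheme(url, scheme):
    """Return `url` with its scheme replaced; host, path and query are kept."""
    parts = urlsplit(url)
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))


def login_over_http(session, login_url, payload):
    """POST the login form to the plain-HTTP variant of `login_url`."""
    return session.post(with_scheme(login_url, "http"), data=payload)


# Hypothetical usage with the names from the question:
#   import requests
#   session = requests.session()
#   p = login_over_http(session, loginurl, payload)
```

Swapping only the scheme keeps the rest of the login address unchanged, which matches what the answer describes.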
