Python urlopen connection aborted - urlopen error [Errno 10053]

Question

I have some code that uses mechanize and BeautifulSoup to scrape some data from the web. The code works fine on a test machine, but the production machine is blocking the connection. The error I get is:

urlopen error [Errno 10053] An established connection was aborted by the software in your host machine

I have read through similar posts and cannot find this exact error. The site I am trying to scrape is HTTPS, but the same error has also occurred with an HTTP site. I am using Python 2.6 and mechanize 0.2.4.

Is this due to the proxy or, as the error says, something on my local machine? I have set up mechanize to use the system's proxy:

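import mechanize
import BeautifulSoup  # BeautifulSoup 3, the Python 2 package

br = mechanize.Browser()
# Present a browser-like User-Agent header
br.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)')]
br.set_proxies({})  # intended to use the system default proxy
page = br.open(url)  # url is defined elsewhere in the script
html = page.read()
soup = BeautifulSoup.BeautifulSoup(html)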

Again, this all works on my test machine, but the production machine gives that Error 10053.
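To confirm that a failure really is the Winsock 10053 abort and not some other network error, one option is to inspect the exception that urlopen raises. The following is a minimal sketch, not the asker's code: it assumes Python 3 and the standard library urllib instead of mechanize on Python 2.6, and the names is_connection_aborted and fetch are hypothetical helpers introduced only for illustration.

```python
import urllib.error
import urllib.request

WSAECONNABORTED = 10053  # Winsock code quoted in the error message


def is_connection_aborted(exc):
    """Return True if exc carries a reason whose error code is 10053."""
    reason = getattr(exc, "reason", None)
    # The code may appear as errno or, on Windows, as winerror.
    codes = (getattr(reason, "errno", None), getattr(reason, "winerror", None))
    return WSAECONNABORTED in codes


def fetch(url, timeout=10):
    """Fetch url; turn a 10053 abort into an error with a diagnostic hint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.URLError as exc:
        if is_connection_aborted(exc):
            raise RuntimeError(
                "Connection aborted locally (10053); check host security "
                "software and proxy settings on this machine"
            ) from exc
        raise
```

Calling fetch with the target URL on both machines would show whether production fails with this specific code or with a different one.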

Recommended answer

The issue was that a host-based IDS was blocking the outbound connection. Problem solved.

I added my Python script to the HIDS exception list, which is the list of files allowed to connect out to the internet. Once the script was on the list, it had network connectivity and I had no further problems. The test machine did not have a HIDS client installed, which is why the script could connect out from there. FYI, both machines had firewalls, but only the production machine had the HIDS.
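Because the two machines differed only in their local security software, running the same small probe on each one can help narrow down this kind of problem. The sketch below is an illustration under assumptions, not part of the original fix: it uses Python 3 standard library calls, and the names probe and report are hypothetical. It records the proxy settings Python detects and whether the Python process can open a plain TCP connection. If the network requires a proxy for outbound traffic, the direct probe can fail for that reason too, so the result is only a hint.

```python
import socket
import urllib.request


def probe(host, port=443, timeout=5):
    """Try a raw TCP connection from this process; return (ok, detail)."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Report the exception type and message for comparison across machines.
        return False, "%s: %s" % (type(exc).__name__, exc)
    sock.close()
    return True, "connected"


def report(host, port=443):
    """Collect the proxies Python detects plus the TCP probe result."""
    ok, detail = probe(host, port)
    return {
        "proxies": urllib.request.getproxies(),
        "tcp_ok": ok,
        "tcp_detail": detail,
    }
```

Printing report("example.com") on the test machine and on the production machine, and comparing the two dictionaries, shows whether the proxy settings match and whether only one machine refuses the connection.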

HIDS stands for host-based intrusion detection system. If the network security team has hidden the HIDS from you, you might not know where to find it, and even if you do find it, you will not be able to disable it. You can ask your security team to add an exception for your script. Another sneaky way around the HIDS is to build your script into an exe (using py2exe) and rename the resulting executable to something already on the HIDS exception list. Your browser is a good candidate: if Firefox is allowed internet access, rename your exe to firefox.exe.
