httplib.InvalidURL: nonnumeric port


Problem description

I'm trying to write a script that checks whether each URL in a list exists:

import httplib

with open('urls.txt') as urls:
    for url in urls:
        connection = httplib.HTTPConnection(url)
        connection.request("GET")
        response = connection.getresponse()
        if response.status == 200:
            print '[{}]: '.format(url), "Up!"

But I get this error:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    connection = httplib.HTTPConnection(url)
  File "/usr/lib/python2.7/httplib.py", line 693, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.7/httplib.py", line 721, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: '//globo.com/galeria/amazonas/a.html

What's wrong?

Recommended answer

httplib.HTTPConnection takes the host and port of the remote server in its constructor, not the whole URL.
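The traceback is consistent with this: the constructor appears to treat everything after the last colon in its argument as the port, so for a string starting with `http://` the "port" becomes `//globo.com/galeria/amazonas/a.html` plus the newline read from the file. If you would rather stay with httplib, a minimal sketch is to split each URL first and pass only the host and port to the constructor. The helper names `split_url` and `get_status` are made up for this illustration, and the import fallback is only there so the same snippet also runs on Python 3.

```python
try:
    # Python 2 names, matching the code in the question
    import httplib
    from urlparse import urlsplit
except ImportError:
    # Python 3 equivalents
    import http.client as httplib
    from urllib.parse import urlsplit


def split_url(url):
    """Return (host, port, path) for an http:// URL; port is None if absent."""
    parts = urlsplit(url.strip())  # strip() drops the newline read from the file
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, parts.port, path


def get_status(url):
    """Issue a GET through HTTPConnection and return the numeric status code."""
    host, port, path = split_url(url)
    # A port of None makes HTTPConnection fall back to the default HTTP port.
    connection = httplib.HTTPConnection(host, port)
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()
```

Inside the loop you would then call `get_status(url)` and compare the result with 200. Note that this sketch only handles plain `http://` URLs and does not follow redirects.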

For your use case, urllib2.urlopen is easier, because it accepts the full URL.

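import urllib2

with open('urls.txt') as urls:
    for line in urls:
        url = line.strip()  # drop the trailing newline read from the file
        if not url:
            continue  # skip blank lines
        try:
            r = urllib2.urlopen(url)
        except urllib2.HTTPError as e:
            r = e  # HTTPError carries the status code in .code
        except urllib2.URLError as e:
            # No HTTP response at all (DNS failure, refused connection, ...)
            print '[{}]: '.format(url), "Unreachable:", e.reason
            continue
        if r.code in (200, 401):
            print '[{}]: '.format(url), "Up!"
        elif r.code == 404:
            print '[{}]: '.format(url), "Not Found!"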
