How to "keep-alive" with cookielib and httplib in python?

Problem description

In Python, I'm using httplib because it keeps the HTTP connection alive (as opposed to urllib and urllib2). Now I want to use cookielib with httplib, but they seem to hate each other: there is no obvious way to interface them.

Does anyone know of a solution to that problem?

Solution

You should consider using the Requests library instead, at the earliest chance you get to refactor your code. In the meantime:

HACK ALERT! :)

I'd go the other suggested way, but I've done a hack (for different reasons, though) that creates an interface between httplib and cookielib.

What I did was create a fake HTTPRequest with the minimal required set of methods, so that CookieJar would recognize it and process cookies as needed. I then used that fake request object to carry all the data cookielib needs.

Here is the code of the class:

class HTTPRequest(object):
    """
    Data container for HTTP request (used for cookie processing).

    `host` is a (hostname, port) tuple; `url` is the request path.
    """

    def __init__(self, host, url, headers=None, secure=False):
        self._host = host
        self._url = url
        self._secure = secure
        self._headers = {}
        # headers=None avoids sharing one mutable default dict between calls
        for key, value in (headers or {}).items():
            self.add_header(key, value)

    def has_header(self, name):
        return name in self._headers

    def add_header(self, key, val):
        self._headers[key.capitalize()] = val

    def add_unredirected_header(self, key, val):
        # cookielib calls this to attach the Cookie header
        self._headers[key.capitalize()] = val

    def is_unverifiable(self):
        return True

    def get_type(self):
        return 'https' if self._secure else 'http'

    def get_full_url(self):
        # Append the port only when it differs from the scheme's default.
        # Ports are compared as integers; comparing a string with 443 or 80
        # would never match.
        port = int(self._host[1])
        default_port = 443 if self._secure else 80
        port_str = '' if port == default_port else ':%d' % port
        return self.get_type() + '://' + self._host[0] + port_str + self._url

    def get_header(self, header_name, default=None):
        return self._headers.get(header_name, default)

    def get_host(self):
        return self._host[0]

    get_origin_req_host = get_host

    def get_headers(self):
        return self._headers

Please note that the class supports the HTTPS protocol only (that was all I needed at the time).

The code that used this class looked like this (note another hack that makes the response compatible with cookielib):
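For readers on Python 3, here is a minimal sketch of the same idea, assuming the renamed modules (httplib became http.client, cookielib became http.cookiejar). In Python 3 the cookie jar reads `host`, `type`, `unverifiable` and `origin_req_host` as plain attributes rather than calling getter methods, so the adapter changes shape slightly. The names `FakeRequest` and `FakeResponse` are hypothetical, and the response is simulated offline so the cookie round trip can be checked without a network connection.

```python
from email.message import Message
from http.cookiejar import CookieJar


class FakeRequest:
    """Hypothetical stand-in for urllib.request.Request (cookie handling only)."""

    def __init__(self, host, port, path, headers=None, secure=False):
        # Attributes that http.cookiejar reads directly.
        self.type = "https" if secure else "http"
        default_port = 443 if secure else 80
        self.host = host if port == default_port else "%s:%d" % (host, port)
        self.origin_req_host = host
        self.unverifiable = False  # assumption: a user-initiated request
        self._path = path
        self._headers = {}
        for key, value in (headers or {}).items():
            self.add_unredirected_header(key, value)

    def get_full_url(self):
        return "%s://%s%s" % (self.type, self.host, self._path)

    def has_header(self, name):
        return name in self._headers

    def get_header(self, name, default=None):
        return self._headers.get(name, default)

    def add_unredirected_header(self, key, val):
        self._headers[key.capitalize()] = val

    def header_items(self):
        return list(self._headers.items())


class FakeResponse:
    """Offline stand-in for http.client.HTTPResponse: the jar only calls info()."""

    def __init__(self, set_cookie_values):
        self._msg = Message()
        for value in set_cookie_values:
            self._msg["Set-Cookie"] = value  # Message appends repeated headers

    def info(self):
        return self._msg


jar = CookieJar()

# Pretend the server answered the first request with a Set-Cookie header.
first = FakeRequest("example.com", 80, "/login")
jar.extract_cookies(FakeResponse(["session=abc123; Path=/"]), first)

# The jar now attaches the stored cookie to the next request for that host.
second = FakeRequest("example.com", 80, "/profile", {"accept": "text/html"})
jar.add_cookie_header(second)
print(second.header_items())
```

With a real connection, `dict(second.header_items())` would be passed as the `headers` argument of `http.client.HTTPConnection.request`. The `response.info` patch from the original hack should not be needed there, since Python 3's `HTTPResponse` already provides `info()`.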

import httplib
from cookielib import CookieJar

# host, request_url, method, body and http_connection (an open
# httplib.HTTPConnection) are assumed to be defined elsewhere.

cookies = CookieJar()

headers = {
    # headers that you wish to set
}

# construct fake request
fake_request = HTTPRequest(host, request_url, headers)

# add cookies to fake request
cookies.add_cookie_header(fake_request)

# issue an httplib.HTTPConnection based request using cookies and headers from the fake request
http_connection.request(method, request_url, body, fake_request.get_headers())

response = http_connection.getresponse()

if response.status == httplib.OK:
    # HACK: pretend we're a urllib2 response (cookielib calls response.info())
    response.info = lambda: response.msg

    # read and store cookies from response
    cookies.extract_cookies(response, fake_request)

    # process response...
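The flow above can also be tried end to end. The sketch below is not the answer's original code: it assumes Python 3 (http.client and http.cookiejar) and swaps the hand-written fake request for a real `urllib.request.Request` that is used only as a cookie carrier and never opened with urllib. `EchoCookieHandler` and `fetch` are hypothetical names, and the throwaway local server exists only so the example can run without external network access.

```python
import http.client
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoCookieHandler(BaseHTTPRequestHandler):
    """Throwaway test server: sets a cookie and echoes the Cookie header it got."""

    protocol_version = "HTTP/1.1"  # required for persistent connections

    def do_GET(self):
        body = self.headers.get("Cookie", "none").encode("ascii")
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(conn, jar, path):
    """Send one GET over an existing connection, applying and updating cookies."""
    url = "http://%s:%d%s" % (conn.host, conn.port, path)
    carrier = urllib.request.Request(url)  # cookie carrier only, never opened
    jar.add_cookie_header(carrier)
    conn.request("GET", path, headers=dict(carrier.header_items()))
    response = conn.getresponse()
    body = response.read()  # drain the body so the socket can be reused
    jar.extract_cookies(response, carrier)
    return body


server = HTTPServer(("127.0.0.1", 0), EchoCookieHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

jar = CookieJar()
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)

first_body = fetch(conn, jar, "/")
first_socket = conn.sock
second_body = fetch(conn, jar, "/")
same_socket = conn.sock is first_socket  # True when the connection was kept alive

print(first_body, second_body, same_socket)

conn.close()
server.shutdown()
server.server_close()
```

The first request carries no cookie, so the server echoes its placeholder; the second request should carry the cookie stored from the first response, and `same_socket` indicates whether both requests travelled over one connection.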
