How do I authenticate a urllib2 script in order to access HTTPS web services from a Django site?

Problem description

Hi everybody. I'm working on a Django/mod_wsgi/Apache2 website that serves sensitive information, using HTTPS for all requests and responses. All views are written to redirect if the user isn't authenticated. It also has several views that are meant to function like RESTful web services.

I'm now in the process of writing a script that uses urllib/urllib2 to contact several of these services in order to download a series of very large files. I'm running into problems with 403: FORBIDDEN errors when attempting to log in.

The (rough-draft) method I'm using for authentication and log in is:

import getpass
import logging
import urllib
import urllib2

# `log` and PATH_TO_LOGIN are not defined in the original snippet; the logger
# below is a stand-in, and PATH_TO_LOGIN is assumed to be set elsewhere.
log = logging.getLogger( __name__ )

def login( base_address, username=None, password=None ):

    # prompt for the username (if needed), password
    if username is None:
        username = raw_input( 'Username: ' )
    if password is None:
        password = getpass.getpass( 'Password: ' )
    log.info( 'Logging in %s' % username )

    # fetch the login page in order to get the csrf token
    cookieHandler = urllib2.HTTPCookieProcessor()
    opener = urllib2.build_opener( urllib2.HTTPSHandler(), cookieHandler )
    urllib2.install_opener( opener )

    login_url = base_address + PATH_TO_LOGIN
    log.debug( "login_url: " + login_url )
    login_page = opener.open( login_url )

    # attempt to get the csrf token from the cookie jar
    csrf_cookie = None
    for cookie in cookieHandler.cookiejar:
        if cookie.name == 'csrftoken':
            csrf_cookie = cookie
            break
    # test csrf_cookie, not the loop variable, which is set for any cookie
    if csrf_cookie is None:
        raise IOError( "No csrf cookie found" )
    log.debug( "found csrf cookie: " + str( csrf_cookie ) )
    log.debug( "csrf_token = %s" % csrf_cookie.value )

    # login using the usr, pwd, and csrf token
    login_data = urllib.urlencode( dict(
        username=username, password=password,
        csrfmiddlewaretoken=csrf_cookie.value ) )
    log.debug( "login_data: %s" % login_data )

    req = urllib2.Request( login_url, login_data )
    response = urllib2.urlopen( req )
    # <--- 403: FORBIDDEN here

    log.debug( 'response url:\n' + str( response.geturl() ) + '\n' )
    log.debug( 'response info:\n' + str( response.info() ) + '\n' )

    # should redirect to the welcome page here, if back at log in - refused
    if response.geturl() == login_url:
        raise IOError( 'Authentication refused' )

    log.info( '\t%s is logged in' % username )
    # save the cookies/opener for further actions
    return opener

I'm using urllib2's HTTPCookieProcessor to store Django's authentication cookies on the script side so I can access the web services and get through my redirects.

I know that Django's CSRF middleware will reject the request if I don't pass the CSRF token along with the login information, so I first pull the token from the cookie jar populated by the initial load of the login page/form. This works with the HTTP/development version of the site.
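The cookie lookup described above can be pulled out into a small helper. This is only a sketch: `find_cookie` is a hypothetical name that does not appear in the original script, and the import fallback is there so the same code also runs on Python 3, where `cookielib` became `http.cookiejar`.

```python
try:  # Python 2, as used in the question
    import cookielib
except ImportError:  # Python 3 name for the same module
    import http.cookiejar as cookielib


def find_cookie(cookiejar, name):
    """Return the first cookie in `cookiejar` called `name`, or None."""
    for cookie in cookiejar:
        if cookie.name == name:
            return cookie
    return None
```

In the login function this would be called as `find_cookie(cookieHandler.cookiejar, 'csrftoken')`. Checking the result against `None` avoids testing the loop variable by mistake.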

Specifically, I'm getting a 403 when trying to post the credentials to the login page/form over the https connection. This method works when used on the development server which uses an http connection.

There is no Apache directory directive that prevents access to that area (that I can see). The script connects successfully to the login page without post data so I'm thinking that would leave Apache out of the problem (but I could be wrong).

The python installations I'm using are both compiled with SSL.

I've also read that urllib2 doesn't allow https connections via proxy. I'm not very experienced with proxies, so I don't know if using a script from a remote machine is actually a proxy connection and whether that would be the problem. Is this causing the access problem?
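One way to rule the proxy question in or out is to look at which proxy settings the standard library picks up for the machine, then retry with an opener that ignores them. The sketch below is an assumption about how one might check, not part of the original script; `configured_proxies` and `build_direct_opener` are hypothetical names.

```python
try:  # Python 2, as used in the question
    import urllib2 as urlrequest
    from urllib import getproxies
except ImportError:  # Python 3 equivalents
    import urllib.request as urlrequest
    from urllib.request import getproxies


def configured_proxies():
    """Return the scheme -> proxy URL mapping detected for this machine."""
    return getproxies()


def build_direct_opener(*handlers):
    """Return an opener that bypasses any automatically detected proxy."""
    # Passing an empty dict disables proxy auto-detection.
    no_proxy = urlrequest.ProxyHandler({})
    return urlrequest.build_opener(no_proxy, *handlers)
```

If `configured_proxies()` returns an empty dict, the script is most likely connecting directly and a proxy is unlikely to be the cause of the 403.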

From what I can tell, the problem is in the combination of cookies and the post data, but I'm unclear as to where to take it from here.

Any help would be appreciated. Thanks

Solution

Please excuse my answering my own question, but for the record, this seems to have solved it:

It turns out I needed to set the HTTP Referer header to the login page url in the request where I post the login information.

req.add_header( 'Referer', login_url )

The reason is explained in the Django CSRF documentation, specifically step 4.
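For requests made over HTTPS, Django's CSRF middleware also checks the Referer header and rejects a POST that lacks one. That is why the same script works against the plain-HTTP development server. Putting the fix together with the form data might look like the sketch below; `build_login_request` is a hypothetical helper, not part of the original script, and the import fallback only exists so the code also runs on Python 3.

```python
try:  # Python 2, as used in the question
    import urllib2 as urlrequest
    from urllib import urlencode
except ImportError:  # Python 3 equivalents
    import urllib.request as urlrequest
    from urllib.parse import urlencode


def build_login_request(login_url, username, password, csrf_token):
    """Build (but do not send) the login POST, including a Referer header."""
    form = urlencode({
        'username': username,
        'password': password,
        'csrfmiddlewaretoken': csrf_token,
    }).encode('ascii')
    req = urlrequest.Request(login_url, form)
    # Point the Referer at the page the form was loaded from, so the
    # HTTPS Referer check passes.
    req.add_header('Referer', login_url)
    return req
```

The returned request should be sent with the same opener that fetched the login page, for example `opener.open(req)`, so that the `csrftoken` cookie travels with it.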

Because of our somewhat peculiar server setup, where production uses HTTPS with DEBUG=False, I wasn't seeing the csrf_failure reason (in this case: 'Referer checking failed - no referer') that is normally shown in the DEBUG output. I ended up printing that failure reason to the Apache error_log and STFW'd on it. That led me to code.djangoproject/.../csrf.py and the Referer header fix.
