Python HTTP status code

Question

I'm writing my own directory buster in Python and testing it against a web server of mine in a safe, controlled environment. The script tries to retrieve common directories from a given website and uses the HTTP status code of each response to determine whether a page is accessible.
To start, the script reads a file containing the directories to look up, then makes the requests as follows:

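import fileinput
import httplib

# `url` is assumed to hold the target hostname and to be defined earlier
for dir in fileinput.input('utils/Directories_Common.wordlist'):

    try:
        conn = httplib.HTTPConnection(url)
        # each line read from the file still ends with a newline at this point
        conn.request("GET", "/"+str(dir))
        toturl = 'http://'+url+'/'+str(dir)[:-1]
        print '    Trying to get: '+toturl
        r1 = conn.getresponse()
        response = r1.read()
        print '   ',r1.status, r1.reason
        conn.close()
    except httplib.HTTPException as e:
        # report protocol-level failures and move on to the next entry
        print '    Request failed:', e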

The response is then checked: if the status code is 200, the page is treated as accessible. I implemented this as follows:

if r1.status == 200:
    print '\n[!] Got it! The subdirectory '+str(dir)+' could be interesting..\n\n\n'

Everything seems fine to me, except that the script marks pages as accessible that actually aren't. The algorithm collects only the pages that return "200 OK", but when I browse to those pages manually, I find that they have been moved permanently or have restricted access. Something is wrong, but I cannot spot where exactly I should fix the code. Any help is appreciated.

Recommended answer

I did not find any problems with your code, except that it is almost unreadable. I have rewritten it into this working snippet:

import httplib

host = 'www.google.com'
directories = ['aosicdjqwe0cd9qwe0d9q2we', 'reader', 'news']

for directory in directories:
    conn = httplib.HTTPConnection(host)
    conn.request('HEAD', '/' + directory)

    url = 'http://{0}/{1}'.format(host, directory)
    print '    Trying: {0}'.format(url)

    response = conn.getresponse()
    print '    Got: ', response.status, response.reason

    conn.close()

    if response.status == 200:
        print ("[!] The subdirectory '{0}' "
               "could be interesting.").format(directory)

Output:

$ python snippet.py
    Trying: http://www.google.com/aosicdjqwe0cd9qwe0d9q2we
    Got:  404 Not Found
    Trying: http://www.google.com/reader
    Got:  302 Moved Temporarily
    Trying: http://www.google.com/news
    Got:  200 OK
[!] The subdirectory 'news' could be interesting.
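
The output above shows three different outcomes (404, 302, 200), while the snippet only reacts to 200. As a rough sketch of how a scanner could label the other cases too, the helper below maps a status code to a coarse category. It is written for Python 3, and the name `classify_status` and the category labels are hypothetical, chosen only for illustration; they are not part of the original code.

```python
def classify_status(status):
    """Map an HTTP status code to a coarse label for a directory scan.

    Hypothetical helper: the labels are illustrative, not a standard.
    """
    if 200 <= status < 300:
        return "accessible"
    if status in (301, 302, 303, 307, 308):
        # The resource lives elsewhere; the Location header says where.
        return "redirected"
    if status in (401, 403):
        # The path exists but requires credentials or is forbidden.
        return "restricted"
    if status == 404:
        return "missing"
    return "other"


# The three codes seen in the output above.
for code in (404, 302, 200):
    print(code, classify_status(code))
```

For the three codes in the output, this yields "missing", "redirected" and "accessible" respectively, so a moved or restricted page would be reported as such instead of being silently skipped.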

Also, I used a HEAD request instead of GET, as it is more efficient when you do not need the content and are interested only in the status code.
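
For readers on Python 3, where `httplib` is named `http.client`, the same HEAD-based check might look like the sketch below. The function names `build_path` and `head_status`, the `port` and `timeout` parameters, and the whitespace stripping are assumptions added for illustration; the answer's snippet does not include them. Stripping matters when entries come from a wordlist file, because each line read from a file ends with a newline.

```python
import http.client


def build_path(entry):
    """Turn a wordlist entry into a request path (hypothetical helper).

    Surrounding whitespace, including the trailing newline of a line
    read from a file, is removed so it never reaches the request line.
    """
    return "/" + entry.strip().lstrip("/")


def head_status(host, entry, port=80, timeout=5.0):
    """Send a HEAD request and return (status, reason).

    http.client does not follow redirects, so a 301 or 302 is returned
    as-is rather than being replaced by the status of the target page.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", build_path(entry))
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        # Close the connection even if the request raised an exception.
        conn.close()
```

A call such as `head_status('www.google.com', 'news\n')` would then request `/news`; the status returned depends on the server at the time of the request.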
