How can I find out the uploaded file name in Python CGI?


Question

I made a simple web server like the one below.

import BaseHTTPServer, os, cgi
import cgitb; cgitb.enable()

html = """
<html>
<body>
<form action="" method="POST" enctype="multipart/form-data">
File upload: <input type="file" name="upfile">
<input type="submit" value="upload">
</form>
</body>
</html>
"""
class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("content-type", "text/html;charset=utf-8")
        self.end_headers()
        self.wfile.write(html)

    def do_POST(self):
        ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
        if ctype == 'multipart/form-data':
            query = cgi.parse_multipart(self.rfile, pdict)
            upfilecontent = query.get('upfile')
            if upfilecontent:
                # i don't know how to get the file name.. so i named it 'tmp.dat'
                fout = file(os.path.join('tmp', 'tmp.dat'), 'wb')
                fout.write (upfilecontent[0])
                fout.close()
        self.do_GET()

if __name__ == '__main__':
    server = BaseHTTPServer.HTTPServer(("127.0.0.1", 8080), Handler)
    print('web server on 8080..')
    server.serve_forever()

In do_POST, I can't figure out how to get the original name of the uploaded file. self.rfile.name is just 'socket'. How can I get the uploaded file name?

Recommended answer

The code you're using as a starting point is pretty broken (e.g. look at that global rootnode, where the name rootnode is used nowhere: clearly half-edited source, and badly edited at that).

Anyway, what form are you using client-side for the POST? How does it set that upfile field?

Why aren't you using the normal FieldStorage approach, as documented in Python's docs? That way, you could use the .file attribute of the appropriate field to get a file-like object to read, or its .value attribute to read it all into memory and get it as a string, plus the .filename attribute of the field to learn the uploaded file's name. The FieldStorage documentation covers this in more detail, though concisely.
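To make it concrete where that file name comes from: in a multipart/form-data request, each part carries a Content-Disposition header whose filename parameter holds the name the client sent. The sketch below is not the FieldStorage approach itself; it is a Python 3 illustration using the standard email package (the cgi module was removed in Python 3.13). The helper name extract_uploads is hypothetical, and the sketch assumes a small upload that fits comfortably in memory.

```python
from email.parser import BytesParser
from email.policy import HTTP


def extract_uploads(content_type, body):
    """Return {field_name: (filename, payload_bytes)} for each file part.

    content_type is the request's Content-Type header value (it carries the
    multipart boundary); body is the raw request body as bytes.
    """
    # Prepend the Content-Type header so the parser can find the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=HTTP).parsebytes(raw)

    uploads = {}
    for part in message.iter_parts():
        # get_filename() reads the filename parameter of Content-Disposition.
        filename = part.get_filename()
        if filename is None:
            continue  # an ordinary form field, not a file upload
        field = part.get_param("name", header="content-disposition")
        uploads[field] = (filename, part.get_payload(decode=True))
    return uploads
```

In a handler like the one in the question, content_type would come from the request headers and body from reading Content-Length bytes off self.rfile.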

Edit: now that the OP has edited the question to clarify, I see the problem: BaseHTTPServer does not set the environment according to the CGI spec, so the cgi module isn't very usable with it. Unfortunately, the only simple approach to setting the environment is to steal and hack a big piece of code from CGIHTTPServer.py (it wasn't intended for reuse, hence the need for, sigh, copy-and-paste coding), e.g.:

import urllib  # needed for urllib.unquote below (Python 2 module name)

def populenv(self):
        path = self.path
        dir, rest = '.', 'ciao'

        # find an explicit query string, if present.
        i = rest.rfind('?')
        if i >= 0:
            rest, query = rest[:i], rest[i+1:]
        else:
            query = ''

        # dissect the part after the directory name into a script name &
        # a possible additional path, to be stored in PATH_INFO.
        i = rest.find('/')
        if i >= 0:
            script, rest = rest[:i], rest[i:]
        else:
            script, rest = rest, ''

        # Reference: http://hoohoo.ncsa.uiuc.edu/cgi/env.html
        # XXX Much of the following could be prepared ahead of time!
        env = {}
        env['SERVER_SOFTWARE'] = self.version_string()
        env['SERVER_NAME'] = self.server.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PROTOCOL'] = self.protocol_version
        env['SERVER_PORT'] = str(self.server.server_port)
        env['REQUEST_METHOD'] = self.command
        uqrest = urllib.unquote(rest)
        env['PATH_INFO'] = uqrest
        env['SCRIPT_NAME'] = 'ciao'
        if query:
            env['QUERY_STRING'] = query
        host = self.address_string()
        if host != self.client_address[0]:
            env['REMOTE_HOST'] = host
        env['REMOTE_ADDR'] = self.client_address[0]
        authorization = self.headers.getheader("authorization")
        if authorization:
            authorization = authorization.split()
            if len(authorization) == 2:
                import base64, binascii
                env['AUTH_TYPE'] = authorization[0]
                if authorization[0].lower() == "basic":
                    try:
                        authorization = base64.decodestring(authorization[1])
                    except binascii.Error:
                        pass
                    else:
                        authorization = authorization.split(':')
                        if len(authorization) == 2:
                            env['REMOTE_USER'] = authorization[0]
        # XXX REMOTE_IDENT
        if self.headers.typeheader is None:
            env['CONTENT_TYPE'] = self.headers.type
        else:
            env['CONTENT_TYPE'] = self.headers.typeheader
        length = self.headers.getheader('content-length')
        if length:
            env['CONTENT_LENGTH'] = length
        referer = self.headers.getheader('referer')
        if referer:
            env['HTTP_REFERER'] = referer
        accept = []
        for line in self.headers.getallmatchingheaders('accept'):
            if line[:1] in "\t\n\r ":
                accept.append(line.strip())
            else:
                accept = accept + line[7:].split(',')
        env['HTTP_ACCEPT'] = ','.join(accept)
        ua = self.headers.getheader('user-agent')
        if ua:
            env['HTTP_USER_AGENT'] = ua
        co = filter(None, self.headers.getheaders('cookie'))
        if co:
            env['HTTP_COOKIE'] = ', '.join(co)
        # XXX Other HTTP_* headers
        # Since we're setting the env in the parent, provide empty
        # values to override previously set values
        for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
                  'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
            env.setdefault(k, "")
        os.environ.update(env)

This could be simplified substantially further, but not without spending some time and energy on the task :-(.
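As a rough idea of what such a simplification might look like, the sketch below builds only the handful of CGI variables that parsing a POSTed form depends on (REQUEST_METHOD, CONTENT_TYPE, CONTENT_LENGTH, plus an empty QUERY_STRING). The helper name minimal_cgi_environ and the assumption that headers is a mapping keyed by lower-case names are mine, not part of CGIHTTPServer. Returning a dict rather than updating os.environ also keeps the process environment untouched; cgi.FieldStorage accepts such a dict through its environ parameter.

```python
def minimal_cgi_environ(method, headers):
    """Build the few CGI variables that parsing a POSTed form relies on.

    headers is assumed to be a mapping with lower-case header names
    (an assumption made for this sketch).
    """
    environ = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": "",  # explicit empty value, as populenv does
    }
    content_type = headers.get("content-type")
    if content_type:
        environ["CONTENT_TYPE"] = content_type
    content_length = headers.get("content-length")
    if content_length:
        environ["CONTENT_LENGTH"] = str(content_length)
    return environ
```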

With this populenv function at hand, we can recode do_POST:

def do_POST(self):
    populenv(self)  # set up the CGI environment before FieldStorage reads it
    form = cgi.FieldStorage(fp=self.rfile)
    upfilecontent = form['upfile'].value
    if upfilecontent:
        fout = open(os.path.join('tmp', form['upfile'].filename), 'wb')
        fout.write(upfilecontent)
        fout.close()
    self.do_GET()
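
One caveat when using .filename to build a path: the value comes straight from the client, so it may contain directory components (some browsers have sent full local paths, and a hostile client can send something like ../../etc/passwd). A minimal sketch of stripping those before joining with the target directory follows; the helper name safe_upload_name and the fallback name are hypothetical.

```python
import ntpath


def safe_upload_name(client_filename, default="upload.dat"):
    """Reduce a client-supplied file name to its final path component."""
    # ntpath.basename treats both "\\" and "/" as separators, so it strips
    # Windows-style and POSIX-style directory parts alike.
    name = ntpath.basename(client_filename or "")
    if name in ("", ".", ".."):
        return default
    return name
```

In the do_POST above, this would wrap the file name before the os.path.join call, e.g. os.path.join('tmp', safe_upload_name(form['upfile'].filename)).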

...and live happily ever after ;-). (Of course, using any decent WSGI server, or even the demo one, would be much easier, but this exercise is instructive about CGI and its internals ;-).)
