Python twill: download a file accessible through a PHP script


Question

I use twill to navigate a website protected by a login form.

from twill.commands import *

go('http://www.example.com/login/index.php') 
fv("login_form", "identifiant", "login")
fv("login_form", "password", "pass")
formaction("login_form", "http://www.example.com/login/control.php")
submit()
go('http://www.example.com/accueil/index.php')

On this last page I want to download an Excel file, which is accessible through a div with the following attribute:

onclick="OpenWindowFull('../util/exports/control.php?action=export','export',200,100);"

With twill I am able to access the URL of the PHP script and show the content of the file.

go('http://www.example.com/util/exports/control.php?action=export')
show()

However, what comes back is a string holding the raw content, which is unusable as such. Is there a way to retrieve the Excel file directly, similar to urllib.urlretrieve()?

Recommended answer

I managed to do it by passing the cookie jar from twill to requests.

Note: I could not use requests alone because of an intricate check at login (I was not able to figure out the correct headers or other options).
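To make the cookie hand-off concrete, here is a minimal standard-library sketch of what flattening a cookie jar into a plain dict amounts to, which is the shape that the cookies argument of requests.get() accepts. The helper names make_cookie and jar_to_dict and the cookie name PHPSESSID are assumptions for illustration only; the real session cookie of the site may be called something else, and this is not the code of requests itself.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain):
    """Build a session cookie; fields other than name/value/domain are placeholders."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def jar_to_dict(jar):
    """Flatten a cookie jar into {name: value}, dropping domain and path."""
    return {cookie.name: cookie.value for cookie in jar}


# Hypothetical session cookie, standing in for what the login would set.
jar = CookieJar()
jar.set_cookie(make_cookie('PHPSESSID', 'abc123', 'www.example.com'))
cookies = jar_to_dict(jar)
```

A dict like this loses the domain and path of each cookie, which is acceptable here because every request goes to the same host.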

import requests
from twill.commands import *

# showing login form with twill
go('http://www.example.com/login/index.php') 
showforms()

# posting login form with twill
fv("login_form", "identifiant", "login")
fv("login_form", "password", "pass")
formaction("login_form", "http://www.example.com/login/control.php")
submit()

# getting binary content with requests using twill cookie jar
cookies = requests.utils.dict_from_cookiejar(get_browser()._session.cookies)
url = 'http://www.example.com/util/exports/control.php?action=export'

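response = requests.get(url, stream=True, cookies=cookies)

if not response.ok:
    raise Exception('Could not get file from ' + url)

# open the output file only once the request has succeeded,
# so a failed request does not leave an empty out.xls behind
with open('out.xls', 'wb') as handle:
    for block in response.iter_content(1024):
        handle.write(block)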
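The writing loop above can be factored into a small helper that accepts any iterable of byte blocks, which makes it possible to try it without network access. This is only a sketch: save_blocks is a hypothetical name, and the list fake_blocks stands in for response.iter_content(1024).

```python
import os
import tempfile


def save_blocks(blocks, path):
    """Write an iterable of byte blocks to path; return the number of bytes written."""
    written = 0
    with open(path, 'wb') as handle:
        for block in blocks:
            if block:  # skip empty blocks
                handle.write(block)
                written += len(block)
    return written


# Stand-in for response.iter_content(1024): a plain list of byte blocks.
fake_blocks = [b'abc', b'', b'defgh']
target = os.path.join(tempfile.mkdtemp(), 'out.xls')
size = save_blocks(fake_blocks, target)
```

With a real response, the call would be save_blocks(response.iter_content(1024), 'out.xls').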
