Python: How can I download data from a webpage where the link is hidden behind the download button?


Question


Suppose I want to download data from this page: http://www.dce.com.cn/publicweb/quotesdata/memberDealPosiQuotes.html

When I click the download button on that page, I get a .csv file.

I want to do this automatically using Python, so that I can specify the date etc.

I found that pandas' pd.read_csv can read data from a webpage, but first one needs the right URL, and in my case I don't know what the URL is.

Besides, I also want to specify the date, the contract etc. myself.

Before asking, I tried the browser dev tools, but I still can't see the URL, and I don't know how to make the download programmatic.

Solution

The JavaScript call exportData('excel') submits a form. Using the Network panel in Chrome DevTools, you can work out the headers and the POST data it sends, and then write a Python script that submits an identical HTTP request.

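import requests

# URL that the form behind exportData('excel') is submitted to
url = 'http://www.dce.com.cn/publicweb/quotesdata/exportMemberDealPosiQuotesData.html'
# Form fields as seen in the Network panel
formdata = {
    'memberDealPosiQuotes.variety': 'a',
    'memberDealPosiQuotes.trade_type': 0,
    'contract.contract_id': 'all',
    'contract.variety_id': 'a',
    'exportFlag': 'excel',
}
response = requests.post(url, data=formdata)
response.raise_for_status()  # fail early on an HTTP error status
# The filename is taken from the Content-Disposition response header
filename = response.headers['Content-Disposition'].split('=')[-1].strip('"')
with open(filename, 'wb') as fp:
    fp.write(response.content)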
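The script above gets the filename by splitting the Content-Disposition header on '='. That works for a plain header, but it breaks if the header is missing. A slightly more defensive variant could look like the sketch below. The helper name filename_from_content_disposition and the fallback name download.bin are made up for illustration, and the header values in the usage note are examples, not values captured from this site.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, default="download.bin"):
    """Extract the filename parameter from a Content-Disposition value.

    Falls back to `default` (a hypothetical placeholder name) when the
    header is absent or carries no filename parameter.
    """
    if not header_value:
        return default
    # The stdlib email parser understands `name=value` header parameters,
    # including quoted values.
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return default
    # Keep only the last path component so a server-supplied name cannot
    # point outside the current directory.
    return os.path.basename(name) or default
```

With a response object from requests, this would be called as filename_from_content_disposition(response.headers.get('Content-Disposition')).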

It's probably possible to modify the POST data to fetch different data, whether by reverse engineering, by trial and error, or by finding some documentation.

For example, you can include fields for the year, month and day:

    'year':2017,
    'month':3,
    'day':20
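To make the date configurable, the form data can be built by a small function. This is only a sketch under assumptions: build_formdata is a hypothetical helper, the variety parameter simply reuses the two variety fields from the script above, and the answer does not say whether the server counts months from 0 or from 1, so compare the result with a request captured in the Network panel before relying on it.

```python
import datetime

# Base fields copied from the form data in the answer's script.
BASE_FORMDATA = {
    'memberDealPosiQuotes.variety': 'a',
    'memberDealPosiQuotes.trade_type': 0,
    'contract.contract_id': 'all',
    'contract.variety_id': 'a',
    'exportFlag': 'excel',
}


def build_formdata(trade_date, variety='a'):
    """Return a copy of the base form data with date fields added.

    Assumption: the month is passed through as the calendar month
    (March -> 3), matching the answer's example; the server's actual
    convention is not confirmed.
    """
    formdata = dict(BASE_FORMDATA)  # copy, so the base dict stays unchanged
    formdata['memberDealPosiQuotes.variety'] = variety
    formdata['contract.variety_id'] = variety
    formdata['year'] = trade_date.year
    formdata['month'] = trade_date.month
    formdata['day'] = trade_date.day
    return formdata
```

The returned dict can be passed as the data argument of requests.post in place of formdata, for example build_formdata(datetime.date(2017, 3, 20)).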
