How to get the last table from this site?
Question
I'm trying to get the last table from this site with Python. My current attempt is below.
The table is titled "Dados Colocação, nos Termos do Anexo VII da Instrução CVM nº 400, de 2003".
import requests

# Fetch the page with a plain GET and print the raw HTML
lin_cvm_oferta = 'http://web.cvm.gov.br/app/esforcosrestritos/#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D&state=eyJhbm8iOiJNakF4T0E9PSIsInZhbG9yIjoiTVRVPSIsImNvbXVuaWNhZG8iOiJNUT09Iiwic2l0dWFjYW8iOiJNZz09In0%3D'
html = requests.get(lin_cvm_oferta).text
print(html)
When I print the HTML, it contains none of the table data.
I already got the first part of the table as JSON, with help from @JackFleeting in this other question (here). PS: I know there is a similar solution here, but I don't want to use Selenium.
Recommended answer
This one is different from your previous question: the page uses the POST method, not GET. You have to use the developer tools in your browser (Network tab, XHR filter) to extract the URL and the payload, and then post it like this:
import requests
import json
url = 'http://web.cvm.gov.br/app/esforcosrestritos/comunicado/getUltimoComunicado'
payload = {"id":931,"dataInclusao":"2016-05-20T09:26:00Z", "dataInicio":"2016-05-18T00:00:00Z","dataEnceramento":"2016-07-05T00:00:00Z", "numeroEmissao":1,"quantidadeSerie":140,"valorMobiliario":{"id":11,
"dataInclusao":"2015-12-01T00:00:00Z",
"descricao":"CERTIFICADOS DE RECEBÍVEIS IMOBILIÁRIOS - CRI",
"relacionadoFundoInvestimento":False,"situacao":"ATIVO"},
"tipoEspecie":{"id":3,"descricao":"Sem Preferência"},
"tipoClasse":{"id":4,"descricao":"Não Aplicável"},
"tipoOferta":{"id":1,"descricao":"Primária"},"tipoForma":{"id":3,"descricao":"Nominativa e Escritural"},"ofertante":{"id":1860,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"emissor":{"id":1859,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"lider":{"id":931,"nrPfPj":17298092000130,"dataRegistro":"1998-10-15T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12},"instituicoesIntermediarias":[{"id":1089,"nrPfPj":59588111000103,"dataRegistro":"1991-08-12T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO VOTORANTIM SA"},{"id":1090,"nrPfPj":90400888000142,"dataRegistro":"1990-12-20T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO SANTANDER (BRASIL) S.A."}],
"valorPrecoUnitario":"1.000,00","inativo":False,
"qtdValoresMobiliarios":0,"valorTotalOferta":0,"variasSeries":True}
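headers = {'content-type': 'application/json'}
# Send the payload as a JSON request body, as the site's own XHR call does
resp = requests.post(url, data=json.dumps(payload), headers=headers)
resp.raise_for_status()  # fail early on an HTTP error status
data = resp.json()  # parse the JSON response body
print(data)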
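As for why the plain GET in the question came back without data, a likely explanation is that everything after the `#` in that URL is a fragment, which HTTP clients do not send to the server. The short sketch below uses only the standard library to show which part of the URL is actually requested; `page_url` is a shortened stand-in for the URL in the question, not the full original.

```python
from urllib.parse import urlsplit

# Shortened stand-in for the URL in the question (the real fragment is longer)
page_url = (
    "http://web.cvm.gov.br/app/esforcosrestritos/"
    "#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D"
)

parts = urlsplit(page_url)

# Scheme, host and path are what an HTTP client requests from the server
requested = f"{parts.scheme}://{parts.netloc}{parts.path}"
print("requested:", requested)

# The fragment (including the '?...' that follows the '#') stays client side,
# where the page's JavaScript reads it and issues the XHR calls for the data
print("query:", repr(parts.query))
print("fragment:", parts.fragment)
```

The server therefore only ever sees the base application path, which would explain why the data has to be fetched from the XHR endpoint directly.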
Note that the site's own POST request uses lowercase true and false, which is JSON syntax. When you paste the payload into Python as a dict, you have to change these to True and False (capitalized, as I did above).
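One way to avoid editing the booleans by hand is to keep the payload as the raw JSON text copied from the browser and let `json.loads` convert it. This is a minimal sketch: `raw_payload` is a shortened excerpt of the payload above, used only for illustration.

```python
import json

# Shortened excerpt of the payload, pasted exactly as the browser shows it
# (lowercase JSON booleans, no manual edits)
raw_payload = '''
{"id": 931, "numeroEmissao": 1, "inativo": false, "variasSeries": true,
 "valorMobiliario": {"id": 11, "relacionadoFundoInvestimento": false}}
'''

# json.loads maps JSON true/false to Python True/False
payload = json.loads(raw_payload)
print(payload["inativo"], payload["variasSeries"])  # prints: False True
```

The resulting dict can then be sent with `requests.post(url, json=payload)`, which serializes it back to JSON and sets the content-type header.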