How to get the last table from this site?

Problem description

I'm trying to get the last table from this site with Python. My current attempt is below.

The table is named "Dados Colocação, nos Termos do Anexo VII da Instrução CVM nº 400, de 2003".

import requests

lin_cvm_oferta = 'http://web.cvm.gov.br/app/esforcosrestritos/#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D&state=eyJhbm8iOiJNakF4T0E9PSIsInZhbG9yIjoiTVRVPSIsImNvbXVuaWNhZG8iOiJNUT09Iiwic2l0dWFjYW8iOiJNZz09In0%3D'
html = requests.get(lin_cvm_oferta).text
print(html)

When I print the html, it doesn't contain any of the data.

I already got the first part of the table as JSON, with help from @JackFleeting in another question (here). PS: I know there is a similar solution here, but I don't want to use Selenium.

Recommended answer

This one is different from your previous question: the page loads its data with a POST request, not a GET. Use the Network tab (XHR filter) of your browser's developer tools to find the URL and the payload, then post it like this:

import requests
import json

url = 'http://web.cvm.gov.br/app/esforcosrestritos/comunicado/getUltimoComunicado'

payload = {"id":931,"dataInclusao":"2016-05-20T09:26:00Z", "dataInicio":"2016-05-18T00:00:00Z","dataEnceramento":"2016-07-05T00:00:00Z", "numeroEmissao":1,"quantidadeSerie":140,"valorMobiliario":{"id":11,
    "dataInclusao":"2015-12-01T00:00:00Z",
    "descricao":"CERTIFICADOS DE RECEBÍVEIS IMOBILIÁRIOS - CRI",
    "relacionadoFundoInvestimento":False,"situacao":"ATIVO"},
    "tipoEspecie":{"id":3,"descricao":"Sem Preferência"},
    "tipoClasse":{"id":4,"descricao":"Não Aplicável"},
    "tipoOferta":{"id":1,"descricao":"Primária"},"tipoForma":{"id":3,"descricao":"Nominativa e Escritural"},"ofertante":{"id":1860,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"emissor":{"id":1859,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"lider":{"id":931,"nrPfPj":17298092000130,"dataRegistro":"1998-10-15T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12},"instituicoesIntermediarias":[{"id":1089,"nrPfPj":59588111000103,"dataRegistro":"1991-08-12T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO VOTORANTIM SA"},{"id":1090,"nrPfPj":90400888000142,"dataRegistro":"1990-12-20T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO SANTANDER (BRASIL) S.A."}],
               "valorPrecoUnitario":"1.000,00","inativo":False,
               "qtdValoresMobiliarios":0,"valorTotalOferta":0,"variasSeries":True}


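headers = {'content-type': 'application/json'}

# Send the payload as a JSON body, as the page's own XHR request does.
resp = requests.post(url, data=json.dumps(payload), headers=headers)
resp.raise_for_status()  # fail early on an HTTP error status
data = resp.json()  # parse the JSON response
print(data)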
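
For context on why the plain GET in the question comes back without the table: everything after the `#` in that URL is a fragment, which the browser keeps to itself and never sends to the server, so `requests.get` only fetches the application's base page. The sketch below splits a shortened copy of the question's URL (the `state` parameter is left out for brevity) to show this. Decoding the fragment's parameters as base64 is an assumption based on how the values look, not something the site documents.

```python
import base64
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the URL from the question (the long 'state' parameter is omitted).
page_url = (
    "http://web.cvm.gov.br/app/esforcosrestritos/"
    "#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D"
)

parts = urlsplit(page_url)

# This is all the server sees when the URL is requested.
requested = f"{parts.scheme}://{parts.netloc}{parts.path}"

# The fragment is handled only by the JavaScript running in the browser.
fragment_query = parts.fragment.split("?", 1)[1]

# Assumption: the fragment's values are base64-encoded text.
params = {
    key: base64.b64decode(values[0]).decode()
    for key, values in parse_qs(fragment_query).items()
}

print(requested)
print(params)
```

Since the table data is not part of that base page, it has to come from the separate XHR call that the code above reproduces.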

Note that the payload is written here as a Python dict literal, so the boolean values have to be True and False (capitalized, as I did above), even though the site's own POST request uses lowercase true and false. json.dumps converts them back to lowercase when the request is sent.
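
If you would rather not edit the booleans by hand, one option is to paste the payload as a raw JSON string and let `json.loads` do the conversion. This is only a sketch: `raw_payload` is a hypothetical variable holding a shortened version of the payload, using a few keys from the full one above.

```python
import json

# Text as copied from the browser's network tab (shortened for illustration).
raw_payload = (
    '{"id": 931, "valorPrecoUnitario": "1.000,00", '
    '"inativo": false, "variasSeries": true}'
)

# JSON true/false become Python True/False when parsed.
payload = json.loads(raw_payload)

# Serializing again writes them back as lowercase true/false.
body = json.dumps(payload)

print(payload["inativo"], payload["variasSeries"])
print(body)
```

The resulting `payload` dict can be passed to `json.dumps` and posted exactly as in the code above.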
