Python web scraping with requests - after login


Problem Description

I have the Python requests/Beautiful Soup code below, which lets me log in to a URL successfully. However, after logging in, to get the data I need I would normally have to do the following manually:

1) Click on 'Statement' in the first row:

2) Select dates, then click 'Run Statement':

3) View the data:

This is the code I have used to log in and reach step 1 above:

import requests
from bs4 import BeautifulSoup

logurl = "https://login.flash.co.za/apex/f?p=pwfone:login"
posturl = "https://login.flash.co.za/apex/wwv_flow.accept"

with requests.Session() as s:
    s.headers = {"User-Agent": "Mozilla/5.0"}
    res = s.get(logurl)
    soup = BeautifulSoup(res.text, "html.parser")

    # Collect the hidden p_arg_names fields from the login form
    arg_names = []
    for name in soup.select("[name='p_arg_names']"):
        arg_names.append(name['value'])

    # Rebuild the form payload from the page's hidden inputs
    values = {
        'p_flow_id': soup.select_one("[name='p_flow_id']")['value'],
        'p_flow_step_id': soup.select_one("[name='p_flow_step_id']")['value'],
        'p_instance': soup.select_one("[name='p_instance']")['value'],
        'p_page_submission_id': soup.select_one("[name='p_page_submission_id']")['value'],
        'p_request': 'LOGIN',
        'p_t01': 'solar',
        'p_arg_names': arg_names,
        'p_t02': 'password',
        'p_md5_checksum': soup.select_one("[name='p_md5_checksum']")['value'],
        'p_page_checksum': soup.select_one("[name='p_page_checksum']")['value']
    }
    s.headers.update({'Referer': logurl})
    r = s.post(posturl, data=values)
    print(r.content)

My question is (beginner speaking): how could I skip steps 1 and 2 and simply do another headers update and POST to the final URL, using the selected dates as form entries (headers and form info below)? (The Referer header corresponds to step 2 above.)


Edit 1: network request from the CSV file download:

Recommended Answer

As others have recommended, Selenium is a good tool for this sort of task. However, I'll suggest an approach using requests, since that's what you asked for in the question.

The success of this approach really depends on how the webpage is built and how the data files are made available (assuming "Save as CSV" in the data view is what you're targeting).

If the login mechanism is cookie-based, you can use Sessions and Cookies in requests. When you submit the login form, a cookie is returned in the response headers. The Session attaches that cookie to any subsequent page requests, which keeps your login active.
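The cookie flow can be sketched with requests' `Session`, which stores cookies from responses and attaches them to later requests automatically. The cookie name and URLs below are hypothetical; a real login POST would populate the cookie jar from the server's `Set-Cookie` headers instead of the manual `cookies.set` call:

```python
import requests

with requests.Session() as s:
    # Simulate the cookie a successful login response would set; a real
    # login POST fills the jar automatically from Set-Cookie headers.
    s.cookies.set("session_id", "abc123", domain="example.com")

    # Prepare (without sending) a follow-up request to inspect what
    # the session would attach to it.
    req = requests.Request("GET", "https://example.com/statements")
    prepared = s.prepare_request(req)
    print(prepared.headers.get("Cookie"))
```

Because the session owns the cookie jar, every `s.get`/`s.post` made after the login carries the cookie without any extra code.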

Also, inspect the network request behind the "Save as CSV" action in the Developer Tools network pane. If you can see the structure of that request, you may be able to make a direct request within your authenticated session, using a statement identifier and the dates as the payload to fetch your results.
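A direct request might look like the sketch below. The endpoint and every field name are assumptions for illustration; copy the real ones from the "Save as CSV" request shown in the DevTools network pane. Preparing the request without sending it lets you verify the encoded body first:

```python
import requests

# Hypothetical endpoint and field names -- replace with the ones
# captured from the DevTools network pane.
CSV_URL = "https://login.flash.co.za/apex/wwv_flow.show"
payload = {
    "p_request": "CSV",      # assumed: request keyword for the export
    "p_t01": "2018-01-01",   # assumed: start-date field
    "p_t02": "2018-01-31",   # assumed: end-date field
}

with requests.Session() as s:
    s.headers.update({"Referer": "https://login.flash.co.za/apex/f?p=pwfone:statement"})
    # Prepare (without sending) the POST to check the urlencoded body.
    req = requests.Request("POST", CSV_URL, data=payload)
    prepared = s.prepare_request(req)
    print(prepared.body)
```

In the real flow you would reuse the logged-in session from the question (`s.post(CSV_URL, data=payload)`) so the authentication cookies travel with the request, then write the response content to a `.csv` file.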

