Login to website via Python Requests


Question

For a university project I am currently trying to log in to a website and scrape a small detail (a list of news articles) from my user profile.

I am new to Python, but I have done this before on other websites. My first two approaches produce different HTTP errors. I suspected a problem with the headers my request is sending, but my understanding of this site's login process appears to be insufficient.

This is the login page: http://seekingalpha.com/account/login

My first approach looks like this:

import requests

with requests.Session() as c:
    requestUrl = 'http://seekingalpha.com/account/orthodox_login'

    USERNAME = 'XXX'
    PASSWORD = 'XXX'

    userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36'

    # Note: requests silently drops keys whose value is None, so use
    # empty strings for the fields the browser submits as empty.
    login_data = {
        "slugs[]": "",
        "rt": "",
        "user[url_source]": "",
        "user[location_source]": "orthodox_login",
        "user[email]": USERNAME,
        "user[password]": PASSWORD,
    }

    c.post(requestUrl, data=login_data,
           headers={"referer": "http://seekingalpha.com/account/login",
                    "user-agent": userAgent})

    page = c.get("http://seekingalpha.com/account/email_preferences")
    print(page.content)

This results in "403 Forbidden".
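One detail that may matter here (my assumption, not something confirmed by the site): in the snippet above the custom headers are passed only to the post() call, so the follow-up get() goes out with requests' default User-Agent, which many sites answer with 403. Setting the headers once on the Session makes every request carry them; a minimal sketch:

```python
import requests

s = requests.Session()
# Session-level headers are merged into every request made through s,
# so the GET after the login POST sends the same User-Agent and Referer.
s.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/49.0.2623.112 Safari/537.36",
    "Referer": "http://seekingalpha.com/account/login",
})

print(s.headers["User-Agent"])
```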

My second approach looks like this:

from requests import Request, Session

requestUrl = 'http://seekingalpha.com/account/orthodox_login'

USERNAME = 'XXX'
PASSWORD = 'XXX'

login_data = {
    "slugs[]": "",
    "rt": "",
    "user[url_source]": "",
    "user[location_source]": "orthodox_login",
    "user[email]": USERNAME,
    "user[password]": PASSWORD,
}
headers = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4",
    "origin": "http://seekingalpha.com",
    "referer": "http://seekingalpha.com/account/login",
    "Cache-Control": "max-age=0",
    # Header values must be strings; an int here makes requests fail.
    "Upgrade-Insecure-Requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36",
}

s = Session()
req = Request('POST', requestUrl, data=login_data, headers=headers)

prepped = s.prepare_request(req)
# Overwriting the prepared body by hand leaves the Content-Length
# header that was computed for the old body, which can itself be
# rejected by the server.
prepped.body = "slugs%5B%5D=&rt=&user%5Burl_source%5D=&user%5Blocation_source%5D=orthodox_login&user%5Bemail%5D=XXX%40XXX.com&user%5Bpassword%5D=XXX"

resp = s.send(prepped)

print(resp.status_code)

In this approach I was trying to prepare the headers exactly as my browser would. Sorry for the redundancy. This results in HTTP error 400.
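A side note on the hand-built body: requests already percent-encodes a data dict (including the bracketed keys), so the manual override is unnecessary and risks drifting out of sync with the computed Content-Length. A minimal offline sketch (no network traffic; the credentials are placeholders) showing the body requests would generate on its own:

```python
import requests

# Placeholder credentials; the point is only to inspect the encoding.
login_data = {
    "slugs[]": "",
    "rt": "",
    "user[url_source]": "",
    "user[location_source]": "orthodox_login",
    "user[email]": "XXX@XXX.com",
    "user[password]": "XXX",
}

# Preparing the request shows the exact body requests would send:
# the same percent-encoded form the browser produces.
prepped = requests.Request(
    "POST", "http://seekingalpha.com/account/orthodox_login",
    data=login_data,
).prepare()

print(prepped.body)
# e.g. slugs%5B%5D=&rt=&...&user%5Bemail%5D=XXX%40XXX.com&user%5Bpassword%5D=XXX
```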

Does someone have an idea what went wrong? Probably a lot.

Answer

Instead of spending a lot of energy on logging in manually and playing with Session, I suggest you just scrape the pages right away using your cookie.

When you log in, a cookie is usually added to your request to identify you.

Your code will look like this:

import requests

# Copy the cookie names and values from your browser's developer
# tools after logging in; these two names are just placeholders.
response = requests.get("http://www.example.com", cookies={
    "c_user": "my_cookie_part",
    "xs": "my_other_cookie_part",
})
print(response.content)
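Building on that, one way to get the values is to copy the raw Cookie header from the browser's developer tools after logging in. A small sketch (the cookie string below is a made-up placeholder) that parses such a header and attaches it to a Session, so every subsequent request sends it:

```python
import requests
from http.cookies import SimpleCookie

# Paste the raw "Cookie:" header copied from the browser here.
raw = "c_user=my_cookie_part; xs=my_other_cookie_part"

cookie = SimpleCookie()
cookie.load(raw)
cookies = {name: morsel.value for name, morsel in cookie.items()}

s = requests.Session()
s.cookies.update(cookies)   # every request made with s now sends the cookies

print(cookies["c_user"])    # → my_cookie_part
```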

