Trouble with requests/Beautiful Soup

Views: 163 Posted: 2016/8/5 19:08:31 Tags: python beautifulsoup python-requests

Problem description

I'm trying to learn to use some web features of Python, and thought I'd practice by writing a script to log in to a webpage at my university. Initially I wrote the code using urllib2, but user alecxe kindly provided me with code using requests/BeautifulSoup (please see: Website form login using Python urllib2, http://stackoverflow.com/questions/35279961/website-form-login-using-python-urllib2/35280124?noredirect=1#comment58303224_35280124).

I am trying to log in to the page http://reg.maths.lth.se/. The page features one login form for students and one for teachers (I am obviously trying to log in as a student). To log in, one should provide a "Personnummer", which is basically the equivalent of a social security number, so I don't want to post my valid number. However, I can reveal that it should be 10 digits long.

The code I was provided (with a small change to the final print statement) is given below:

import requests
from bs4 import BeautifulSoup

PNR = "00000000"

url = "http://reg.maths.lth.se/"
login_url = "http://reg.maths.lth.se/login/student"
with requests.Session() as session:
    # extract token
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    token = soup.find("input", {"name": "_token"})["value"]

    # submit form
    session.post(login_url, data={
        "_token": token,
        "pnr": PNR
    })

    # navigate to the main page again (should be logged in)
    #response = session.get(url) ##This is deliberately commented out

    soup = BeautifulSoup(response.content, "html.parser")
    print(soup)

It is thus supposed to print the source code of the page obtained after POSTing the pnr.

While the code runs, it always returns the source code of the main page http://reg.maths.lth.se/, which is not correct. For example, if you manually enter a pnr of the wrong length, e.g. 0, you should be directed to a different page, located at the URL http://reg.maths.lth.se/login/student, whose source code is obviously different from that of the main page.

Any suggestions?
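One thing that can be checked directly from the listing above: the return value of session.post is never stored, and with the second session.get commented out, response still refers to the very first GET of the main page. That alone would make the final print show the main page, whatever the server sent back. Below is a minimal sketch of the same flow that keeps the POST response. The helper names extract_token and login_and_fetch are made up for illustration, and the form field names are taken from the listing above rather than verified against the site.

```python
from bs4 import BeautifulSoup


def extract_token(html):
    """Return the value of the hidden _token input, or None if it is absent."""
    field = BeautifulSoup(html, "html.parser").find("input", {"name": "_token"})
    return field["value"] if field is not None else None


def login_and_fetch(session, url, login_url, pnr):
    """Fetch the form page, POST the login form, and return the POST response.

    `session` is expected to behave like requests.Session (get/post methods).
    """
    token = extract_token(session.get(url).content)
    # Keep the return value: this is the page the server sends after the POST.
    return session.post(login_url, data={"_token": token, "pnr": pnr})


# Hypothetical usage against the site from the question (needs network access):
#
#   import requests
#   with requests.Session() as session:
#       result = login_and_fetch(
#           session,
#           "http://reg.maths.lth.se/",
#           "http://reg.maths.lth.se/login/student",
#           "0000000000",  # placeholder, not a valid Personnummer
#       )
#       print(result.url, result.history)
#       print(BeautifulSoup(result.content, "html.parser"))
```

Printing result.url and result.history should show whether the server answered the POST directly or redirected somewhere else, which may help tell a failed login apart from a script that is simply printing the wrong response.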
